Python Iterate Through Connected Components In Grayscale Image
Solution 1:
You can use networkx:
from itertools import product, repeat
import numpy as np
import networkx as nx
arr = np.array(
[[255,255,255,255,255,255,255],
[255,255, 0 ,10 ,255,255, 1 ],
[255,30 ,255,255,50 ,255, 9 ],
[51 ,20 ,255,255, 9 ,255,240],
[255,255,80 ,50 ,170,255, 20],
[255,255,255,255,255,255, 0 ],
[255,255,255,255,255,255, 69]])
# generate edges
shift = list(product(*repeat([-1, 0, 1], 2)))
x_max, y_max = arr.shape
edges = []
for x, y in np.ndindex(arr.shape):
for x_delta, y_delta in shift:
x_neighb = x + x_delta
y_neighb = y + y_delta
if (0 <= x_neighb < x_max) and (0 <= y_neighb < y_max):
edge = (x, y), (x_neighb, y_neighb)
edges.append(edge)
# build graph
G = nx.from_edgelist(edges)
# draw graph
pos = {(x, y): (y, x_max-x) for x, y in G.nodes()}
nx.draw(G, with_labels=True, pos=pos, node_color='coral', node_size=1000)
# draw graph with numbers
labels = dict(np.ndenumerate(arr))
node_color = ['coral'if labels[n] == 255else'lightgrey'for n in G.nodes()]
nx.draw(G, with_labels=True, pos=pos, labels=labels, node_color=node_color, node_size=1000)
# build subgraphselect = np.argwhere(arr < 255)
G1 = G.subgraph(map(tuple, select))
# draw subgraphpos = {(x, y): (y, x_max-x) forx, y in G1.nodes()}
labels1 = {n:labels[n] for n in G1.nodes()}
nx.draw(G1, with_labels=True, pos=pos, labels=labels1, node_color='lightgrey', node_size=1000)
# find connected components and DFS trees
for i in nx.connected_components(G1):
source = next(iter(i))
idx = nx.dfs_tree(G1, source=source)
print(arr[tuple(np.array(idx).T)])
Output:
[ 0 10 50 9 50 80 20 30 51 170][ 9 1 240 20 0 69]Solution 2:
So after so much researches for suitable implementation of connected components, I came up with my solution. In order to reach the best I can do in terms of performance, I relied on these rules:
- Not to use
networkxbecause it's slow according to this benchmark - Use vectorized actions as much as possible because Python based iterations are slow according to this answer.
I'm implementing an algorithm of connected components of image here only because I believe this is an essential part of this question.
Algorithm of connected components of image
import numpy as np
import numexpr as ne
import pandas as pd
import igraph
defget_coords(arr):
x, y = np.indices(arr.shape)
mask = arr != 255return np.array([x[mask], y[mask]]).T
defcompare(r1, r2):
#assuming r1 is a sorted array, returns:# 1) locations of r2 items in r1# 2) mask array of these locations
idx = np.searchsorted(r1, r2)
idx[idx == len(r1)] = 0
mask = r1[idx] == r2
return idx, mask
defget_reduction(coords, s):
d = {'s': s, 'c0': coords[:,0], 'c1': coords[:,1]}
return ne.evaluate('c0*s+c1', d)
defget_bounds(coords, increment):
return np.max(coords[1]) + 1 + increment
defget_shift_intersections(coords, shifts):
# instance that consists of neighbours found for each node [[0,1,2],...]
s = get_bounds(coords, 10)
rdim = get_reduction(coords, s)
shift_mask, shift_idx = [], []
for sh in shifts:
sh_rdim = get_reduction(coords + sh, s)
sh_idx, sh_mask = compare(rdim, sh_rdim)
shift_idx.append(sh_idx)
shift_mask.append(sh_mask)
return np.array(shift_idx).T, np.array(shift_mask).T,
defconnected_components(coords, shifts):
shift_idx, shift_mask = get_shift_intersections(coords, shifts)
x, y = np.indices((len(shift_idx), len(shift_idx[0])))
vertices = np.arange(len(coords))
edges = np.array([x[shift_mask], shift_idx[shift_mask]]).T
graph = igraph.Graph()
graph.add_vertices(vertices)
graph.add_edges(edges)
graph_tags = graph.clusters().membership
values = pd.DataFrame(graph_tags).groupby([0]).indices
return values
coords = get_coords(arr)
shifts=((0,1),(1,0),(1,1),(-1,1))
comps = connected_components(coords, shifts=shifts)
for c in comps:
print(coords[comps[c]].tolist())
Outcome
[[1, 2], [1, 3], [2, 1], [2, 4], [3, 0], [3, 1], [3, 4], [4, 2], [4, 3], [4, 4]]
[[1, 6], [2, 6], [3, 6], [4, 6], [5, 6], [6, 6]]
Explanation
Algorithm consists of these steps:
We need to convert image to coordinates of non-white cells. It can be done using function:
def get_coords(arr): x, y = np.indices(arr.shape) mask = arr != 255return np.array([y[mask], x[mask]]).TI'll name an outputting array by
Xfor clarity. Here is an output of this array, visually:Next, we need to consider all the cells of each shift that intersects with
X:In order to do that, we should solve a problem of intersections I posted few days before. I found it quite difficult to do using multidimensional
numpyarrays. Thanks to Divakar, he proposes a nice way of dimensionality reduction usingnumexprpackage which fastens operations even more thannumpy. I implement it here in this function:defget_reduction(coords, s): d = {'s': s, 'c0': coords[:,0], 'c1': coords[:,1]} return ne.evaluate('c0*s+c1', d)In order to make it working, we should set a bound
swhich can be calculated automatically using a functiondefget_bounds(coords, increment): return np.max(coords[1]) + 1 + incrementor inputted manually. Since algorithm requires increasing coordinates, pairs of coordinates might be out of bounds, therefore I have used a slight
incrementhere. Finally, as a solution to my post I mentioned here, indexes of coordinates ofX(reduced to 1D), that intersects with any other array of coordinatesY(also reduced to 1D) can be accessed via functiondefcompare(r1, r2): # assuming r1 is a sorted array, returns:# 1) locations of r2 items in r1# 2) mask array of these locations idx = np.searchsorted(r1, r2) idx[idx == len(r1)] = 0 mask = r1[idx] == r2 return idx, maskPlugging all the corresponding
arraysof shifts. As we can see, abovementioned function outputs two variables: an array of index locations in the main setXand its mask array. A proper indexes can be found usingidx[mask]and since this procedure is being applied for each shift, I implementedget_shift_intersections(coords, shifts)method for this case.Final: constructing nodes & edges and taking output from
igraph. The point here is thatigraphperforms well only with nodes that are consecutive integers starting from 0. That's why my script was designed to use mask-based access to item locations inX. I'll explain briefly how did I useigraphhere:I have calculated coordinate pairs:
[[1, 2], [1, 3], [1, 6], [2, 1], [2, 4], [2, 6], [3, 0], [3, 1], [3, 4], [3, 6], [4, 2], [4, 3], [4, 4], [4, 6], [5, 6], [6, 6]]Then I assigned integers for them:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]My edges looks like this:
[[0, 1], [1, 4], [2, 5], [3, 7], [3, 0], [4, 8], [5, 9], [6, 7], [6, 3], [7, 10], [8, 12], [9, 13], [10, 11], [11, 12], [11, 8], [13, 14], [14, 15]]Output of
graph.clusters().membershiplooks like this:[0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1]And finally, I have used
groupbymethod ofPandasto find indexes of separate groups (I use Pandas here because I found it to be the most efficient way of grouping in Python)
Notes
Download of igraphis not straightforward, you might need to install it from unofficial binaries.





Post a Comment for "Python Iterate Through Connected Components In Grayscale Image"