未验证 提交 0f0b1e96 编写于 作者: rictjo's avatar rictjo 提交者: GitHub

graphs++

上级 7b0bcb40
......@@ -504,9 +504,10 @@ the definition for the mitochondrion is fully contained within the melanosome me
[DBSCAN](https://en.wikipedia.org/wiki/DBSCAN) is a clustering algorithm that can be seen as a way of rejecting points, from any cluster, that are positioned in low dense regions of a point cloud. This introduces holes and may result in a larger segment, that would otherwise be connected via a non dense link to become disconnected and form two segments, or clusters. The rejection criterion is simple. The central concern is to evaluate a distance matrix <img src="https://render.githubusercontent.com/render/math?math=D_{ij}"> with an applied cutoff <img src="https://render.githubusercontent.com/render/math?math=\epsilon"> this turns the distances into true or false values depending on if a pair distance between point i and j is within the distance cutoff. This new binary Neighbour matrix <img src="https://render.githubusercontent.com/render/math?math=N_{ij}=D_{ij}\le\epsilon"> tells you wether or not two points are neighbours (including itself). The DBSCAN criterion states that a point is not part of any cluster if it has fewer than `minPts` neighbors. Once you've calculated the distance matrix you can immediately evaluate the number of neighbors each point has and the rejection criterion, via <img src="https://render.githubusercontent.com/render/math?math=R_i=(\sum_{j} D_{ij}\le\epsilon)-1 < minPts">. If the rejection vector R value of a point is True then all the pairwise distances in the distance matrix of that point is set to a value larger than epsilon. This ensures that a distance matrix search will reject those points as neighbours of any other for the choosen epsilon. By tracing out all points that are neighbors and assessing the [connectivity](https://github.com/richardtjornhammar/impetuous/blob/master/src/impetuous/clustering.py) (search for connectivity) you can find all the clusters.
In this [example](https://gist.githubusercontent.com/richardtjornhammar/e84056e0b10f8d550258a1e8944ee375/raw/e44e7226b6cb8ca486ff539ccfa775be981a549c/example9.py) we do exactly this for two gaussian point clouds. The dbscan search is just a single line `dbscan ( data_frame = point_cloud_df , eps=0.45 , minPts=4 )`, while the last lines are there to plot the [results](https://bl.ocks.org/richardtjornhammar/raw/0cc0ff037e88c76a9d65387155674fd1/?raw=true) ( has [graph revision dates](https://gist.github.com/richardtjornhammar/0cc0ff037e88c76a9d65387155674fd1/revisions) )
In this [example](https://gist.githubusercontent.com/richardtjornhammar/e84056e0b10f8d550258a1e8944ee375/raw/e44e7226b6cb8ca486ff539ccfa775be981a549c/example9.py) we do exactly this for two gaussian point clouds. The dbscan search is just a single line `dbscan ( data_frame = point_cloud_df , eps=0.45 , minPts=4 )`, while the last lines are there to plot the [results](https://richardtjornhammar.github.io/?https://gist.githubusercontent.com/richardtjornhammar/0cc0ff037e88c76a9d65387155674fd1/raw/f8c740cd648247889f21eeaadb7b7c74577540be/index.html) ( has [graph revision dates](https://gist.github.com/richardtjornhammar/0cc0ff037e88c76a9d65387155674fd1/revisions) )
The [radial distribution function](https://en.wikipedia.org/wiki/Radial_distribution_function) is a useful tool for visualizing whether or not there are radial clustering tendencies at any average distance between the group of interest and any other constituents of the system. This structure assessment method is usually used for [analysis](https://gist.githubusercontent.com/richardtjornhammar/33162d3be1e92f1b1fafbd9e46954e91/raw/c0685bb79527c947213ffe08973d3ea4e072257e/argon.py) of particle systems, i.e. see [liquid structure](https://bl.ocks.org/richardtjornhammar/raw/bc1e9a8b4c693a338ef812a74ab685e9/?raw=true). It is implemented in the `clustering` module and is demonstrated [here](https://gist.githubusercontent.com/richardtjornhammar/f25ec2eef0703f07ebc0d678123f450e/raw/b9ac597a9d2587727af3cb06a8090ad0eaf0ba49/example10.py). If there is a significant density close to `r=0` then you cannot separate the group from the mean background. This also means that any significance test between those groups will tell you that the grouping is insignificant. The [resulting plot](https://bl.ocks.org/richardtjornhammar/raw/ff417450790c8c885b077fc7ee20409d/?raw=true) has [revision dates](https://gist.github.com/richardtjornhammar/ff417450790c8c885b077fc7ee20409d/revisions). Since the radial distribution function calculates the spherically symmetric distribution of points surrounding an analyte, or analyte group, of interest it is effectively analogous to segmenting the distance matrix and leaving out any self interaction distances that may or may not be present.
The [radial distribution function](https://en.wikipedia.org/wiki/Radial_distribution_function) is a useful tool for visualizing whether or not there are radial clustering tendencies at any average distance between the group of interest and any other constituents of the system. This structure assessment method is usually used for [analysis](https://gist.githubusercontent.com/richardtjornhammar/33162d3be1e92f1b1fafbd9e46954e91/raw/c0685bb79527c947213ffe08973d3ea4e072257e/argon.py) of particle systems, i.e. see [liquid structure](
https://richardtjornhammar.github.io/?https://gist.githubusercontent.com/richardtjornhammar/bc1e9a8b4c693a338ef812a74ab685e9/raw/5276ce75960fa99b5a80972b9e187dc2df29831b/index.html). It is implemented in the `clustering` module and is demonstrated [here](https://gist.githubusercontent.com/richardtjornhammar/f25ec2eef0703f07ebc0d678123f450e/raw/b9ac597a9d2587727af3cb06a8090ad0eaf0ba49/example10.py). If there is a significant density close to `r=0` then you cannot separate the group from the mean background. This also means that any significance test between those groups will tell you that the grouping is insignificant. The [resulting plot](https://richardtjornhammar.github.io/?https://gist.githubusercontent.com/richardtjornhammar/ff417450790c8c885b077fc7ee20409d/raw/65554165ffddf309272a14d6aba1e3fac9fa1a13/index.html) has [revision dates](https://gist.github.com/richardtjornhammar/ff417450790c8c885b077fc7ee20409d/revisions). Since the radial distribution function calculates the spherically symmetric distribution of points surrounding an analyte, or analyte group, of interest it is effectively analogous to segmenting the distance matrix and leaving out any self interaction distances that may or may not be present.
The functions `select_from_distance_matrix` uses boolean indexing to select rows and columns (it is symmetric) in the distance matrix and the `exclusive_pdist` function calculates all pairs between the points in the two separate groups.
......@@ -925,7 +926,7 @@ So it is clear that a linkage method is more efficient for constructing complete
These examples were meant as illustrations of some of the codes implemented in the impetuous-gfa package.
The impetuous visualisation codes requires [Bokeh](https://docs.bokeh.org/en/latest/index.html) and are still being migrated to work with the latest Bokeh versions. For an example of the dynamic `triplot` routine (you can click on the lefthand and bottom scatter points) you can view it [here](https://bl.ocks.org/richardtjornhammar/raw/463e1aa3faceb95a4f894351f16b215a/?raw=true) ( with [revision dates](https://gist.github.com/richardtjornhammar/463e1aa3faceb95a4f894351f16b215a/revisions) or download it [here](https://gist.github.com/richardtjornhammar/463e1aa3faceb95a4f894351f16b215a/) ).
The impetuous visualisation codes requires [Bokeh](https://docs.bokeh.org/en/latest/index.html) and are still being migrated to work with the latest Bokeh versions. For an example of the dynamic `triplot` routine (you can click on the lefthand and bottom scatter points) you can view it [here](https://rictjo.github.io/?https://gist.githubusercontent.com/richardtjornhammar/e6a7570140bd1216a5681c9d96ff157a/raw/6c9230f06b98c1226c6de455816a381c140236c8/index.html) ( with [revision dates](https://gist.github.com/richardtjornhammar/e6a7570140bd1216a5681c9d96ff157a/revisions) or download it [here](https://gist.github.com/richardtjornhammar/e6a7570140bd1216a5681c9d96ff157a/) ).
Some of the algorithms rely on the SVD implementation in Numpy. A switch is planned for the future.
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册