Unverified commit 2240f202, authored by hold2010, committed by GitHub

Update 4.md

Parent 46203912
# Plotting with categorical data
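In the [relational plot tutorial](relational.html#relational-tutorial), we saw how to use different visual representations to show the relationship between multiple variables in a dataset. The examples there focused on cases where the main relationship was between two numerical variables. If one of the main variables is "categorical" (that is, divided into discrete groups), a more specialized approach to visualization can be helpful.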
......@@ -46,7 +46,7 @@ sns.catplot(x="day", y="total_bill", data=tips);
![http://seaborn.pydata.org/_images/categorical_4_0.png](img/8b6d8073c4f09f7584b0ff62e5d28d0d.jpg)
The `jitter` parameter controls the magnitude of the jitter, or you can disable it altogether:
```py
sns.catplot(x="day", y="total_bill", jitter=False, data=tips);
......@@ -201,11 +201,11 @@ sns.swarmplot(x="day", y="total_bill", color="k", size=3, data=tips, ax=g.ax);
## Statistical estimation within categories
For other applications, rather than showing the distribution within each category, you might want to show an estimate of the central tendency of the values. Seaborn has two main ways to show this information. Importantly, the basic API for these functions is identical to that of the functions discussed above.
### Bar plots
A familiar style of plot that accomplishes this goal is a bar plot. In seaborn, the [`barplot()`](../generated/seaborn.barplot.html#seaborn.barplot "seaborn.barplot") function operates on a full dataset and applies a function to obtain the estimate (the mean by default). When there are multiple observations in each category, it also uses bootstrapping to compute a confidence interval around the estimate and draws that interval as error bars:
```py
titanic = sns.load_dataset("titanic")
......@@ -215,7 +215,7 @@ sns.catplot(x="sex", y="survived", hue="class", kind="bar", data=titanic);
![http://seaborn.pydata.org/_images/categorical_36_0.png](img/727bcad15c428cfdd74d27db87677157.jpg)
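To make the bootstrapping step concrete, here is a minimal standard-library sketch of a percentile bootstrap interval around a mean. It is not seaborn's implementation: the helper `bootstrap_ci`, its parameters, and the sample values are hypothetical and chosen only for illustration.

```py
import random
import statistics


def bootstrap_ci(values, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap interval for the mean of `values` (illustrative helper)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, compute the mean of each resample, and sort the means.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Cut off (100 - ci) / 2 percent of the resampled means in each tail.
    lo_idx = n_boot * (100 - ci) // 200
    hi_idx = n_boot - lo_idx - 1
    return boot_means[lo_idx], boot_means[hi_idx]


# Made-up 0/1 outcomes standing in for one category of a binary variable.
survived = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
estimate = statistics.fmean(survived)  # bar height: the mean
low, high = bootstrap_ci(survived)     # error bar: the bootstrap interval
print(f"mean={estimate:.2f}, interval=({low:.2f}, {high:.2f})")
```

The bar height corresponds to `estimate`, and the error bar spans `low` to `high`. With more observations per category, the interval narrows.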
A special case of the bar plot arises when you want to show the number of observations in each category rather than compute a statistic for a second variable. This is similar to a histogram over a categorical, rather than quantitative, variable. In seaborn, this is easy to do with the [`countplot()`](../generated/seaborn.countplot.html#seaborn.countplot "seaborn.countplot") function:
```py
sns.catplot(x="deck", kind="count", palette="ch:.25", data=titanic);
......@@ -224,7 +224,7 @@ sns.catplot(x="deck", kind="count", palette="ch:.25", data=titanic);
![http://seaborn.pydata.org/_images/categorical_38_0.png](img/9788e48ccbd895c1e539c7de634be115.jpg)
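The quantity a count plot displays is just a tally per category. A rough standard-library sketch of that tally follows; the deck labels are made up, and skipping missing entries is an assumption of this sketch rather than a statement about seaborn's internals.

```py
from collections import Counter

# Made-up deck labels; None marks a missing value.
decks = ["C", "E", "C", None, "B", "C", None, "E"]

# Tally the observations in each category, ignoring missing entries.
counts = Counter(d for d in decks if d is not None)

# Crude text rendering: one bar per category, bar length equal to the count.
for deck in sorted(counts):
    print(deck, "#" * counts[deck])
```

No second variable is involved: the bar length depends only on how often each category occurs.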
Both [`barplot()`](../generated/seaborn.barplot.html#seaborn.barplot "seaborn.barplot") and [`countplot()`](../generated/seaborn.countplot.html#seaborn.countplot "seaborn.countplot") can be invoked with all of the options discussed above, along with others that are demonstrated in the detailed documentation for each function:
```py
sns.catplot(y="deck", hue="class", kind="count",
......@@ -237,7 +237,7 @@ sns.catplot(y="deck", hue="class", kind="count",
### Point plots
An alternative style for visualizing the same information is offered by the [`pointplot()`](../generated/seaborn.pointplot.html#seaborn.pointplot "seaborn.pointplot") function. This function also encodes the value of the estimate with height on the other axis, but rather than showing a full bar, it plots the point estimate and confidence interval. Additionally, [`pointplot()`](../generated/seaborn.pointplot.html#seaborn.pointplot "seaborn.pointplot") connects points from the same `hue` category. This makes it easy to see how the main relationship changes as a function of the hue semantic, because the eye is quite good at picking up differences in slope:
```py
sns.catplot(x="sex", y="survived", hue="class", kind="point", data=titanic);
......@@ -246,7 +246,7 @@ sns.catplot(x="sex", y="survived", hue="class", kind="point", data=titanic);
![http://seaborn.pydata.org/_images/categorical_42_0.png](img/687f85f5e0e06a68e6150c0533ff1748.jpg)
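The slope comparison a point plot invites can also be written out numerically. The sketch below uses made-up rows (not the real `titanic` data) and hypothetical variable names: it computes one mean per (`hue`, `x`) group and then the change between the two `x` categories for each `hue` level, which is the slope of the connecting line.

```py
from collections import defaultdict
from statistics import fmean

# Made-up (sex, class, survived) rows for illustration only.
rows = [
    ("male", "First", 1), ("male", "First", 0),
    ("female", "First", 1), ("female", "First", 1),
    ("male", "Third", 0), ("male", "Third", 0),
    ("male", "Third", 0), ("male", "Third", 1),
    ("female", "Third", 1), ("female", "Third", 0),
]

# Group the outcome by (hue level, x category).
groups = defaultdict(list)
for sex, cls, survived in rows:
    groups[(cls, sex)].append(survived)

# One point estimate (the mean) per group.
means = {key: fmean(vals) for key, vals in groups.items()}

# Slope of the line joining the "male" and "female" points of each hue level.
classes = sorted({cls for cls, _ in means})
slopes = {cls: means[(cls, "female")] - means[(cls, "male")] for cls in classes}
print(slopes)
```

Different slopes across `hue` levels mean the effect of the `x` variable depends on the `hue` variable, which is what the connected points make visible.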
Although the categorical functions lack the `style` semantic of the relational functions, it can still be a good idea to vary the marker and/or linestyle along with the hue, so that figures are as accessible as possible and reproduce well in black and white:
```py
sns.catplot(x="class", y="survived", hue="sex",
......