A few basic data sources and sinks are built into Flink and are always available. The [predefined data sources](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/datastream_api.html#data-sources) include reading from files, directories, and sockets, and ingesting data from collections and iterators. The [predefined data sinks](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/datastream_api.html#data-sinks) support writing to files, to stdout and stderr, and to sockets.
Connectors provide code for interfacing with various third-party systems. Currently these systems are supported:
* [Apache Kafka](kafka.html) (source/sink)
* [Apache Cassandra](cassandra.html) (sink)
...
...
* [Twitter Streaming API](twitter.html) (source)
Keep in mind that using one of these connectors in an application usually requires additional third-party components, e.g. servers for the data stores or message queues. Note also that while the streaming connectors listed in this section are part of the Flink project and are included in source releases, they are not included in the binary distributions. Further instructions can be found in the corresponding subsections.
Using a connector isn’t the only way to get data in and out of Flink. One common pattern is to query an external database or web service in a `Map` or `FlatMap` in order to enrich the primary datastream. Flink offers an API for [Asynchronous I/O](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/asyncio.html) to make it easier to do this kind of enrichment efficiently and robustly.
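
The difference between a blocking lookup inside a `Map` and an asynchronous one can be sketched in plain Java. This is only an illustration of the idea, not Flink's Asynchronous I/O API: the class `AsyncEnrichmentSketch`, the `lookup` method, and the simulated 50 ms latency are all hypothetical stand-ins for a real database or web service client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncEnrichmentSketch {

    // Hypothetical stand-in for a remote lookup (database or web service).
    static String lookup(String key) {
        try {
            Thread.sleep(50); // simulated network latency (assumption)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase();
    }

    // Blocking style: each record waits for its lookup before the next one starts,
    // which is what happens when the call is made directly inside a Map function.
    static List<String> enrichBlocking(List<String> keys) {
        List<String> out = new ArrayList<>();
        for (String k : keys) {
            out.add(k + ":" + lookup(k));
        }
        return out;
    }

    // Asynchronous style: all lookups are submitted first, so they overlap in time;
    // results are then collected in input order.
    static List<String> enrichAsync(List<String> keys, ExecutorService pool) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String k : keys) {
            pending.add(CompletableFuture.supplyAsync(() -> k + ":" + lookup(k), pool));
        }
        List<String> out = new ArrayList<>();
        for (CompletableFuture<String> f : pending) {
            out.add(f.join());
        }
        return out;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> keys = Arrays.asList("a", "b", "c", "d");
            System.out.println(enrichBlocking(keys));
            System.out.println(enrichAsync(keys, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Both methods produce the same enriched records; the asynchronous version simply keeps several requests in flight at once instead of waiting on each in turn. In a Flink job the same idea is expressed through the Asynchronous I/O API linked above rather than a hand-managed thread pool.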
When a Flink application pushes a lot of data to an external data store, this can become an I/O bottleneck. If the data is written far more often than it is read, a better approach can be for an external application to pull the data it needs from Flink. The [Queryable State](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/queryable_state.html) interface enables this by allowing the state managed by Flink to be queried on demand.