Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
✓Committers with academic emails
3 of 225 committers (1.3%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (14.9%) to scientific vocabulary
Keywords
kafka
python
Keywords from Contributors
distributed
data-mining
unit-testing
transformer
closember
agents
parallel
cryptocurrencies
large-language-models
shellcode
Last synced: 6 months ago
·
JSON representation
Repository
Python client for Apache Kafka
Basic Info
- Host: GitHub
- Owner: dpkp
- License: apache-2.0
- Language: Python
- Default Branch: master
- Homepage: http://kafka-python.readthedocs.io/
- Size: 5.65 MB
Statistics
- Stars: 5,794
- Watchers: 142
- Forks: 1,441
- Open Issues: 43
- Releases: 61
Topics
kafka
python
Created over 13 years ago
· Last pushed 7 months ago
Metadata Files
Readme
Changelog
License
Support
Authors
README.rst
Kafka Python client
------------------------
.. image:: https://img.shields.io/badge/kafka-4.0--0.8-brightgreen.svg
:target: https://kafka-python.readthedocs.io/en/master/compatibility.html
.. image:: https://img.shields.io/pypi/pyversions/kafka-python.svg
:target: https://pypi.python.org/pypi/kafka-python
.. image:: https://coveralls.io/repos/dpkp/kafka-python/badge.svg?branch=master&service=github
:target: https://coveralls.io/github/dpkp/kafka-python?branch=master
.. image:: https://img.shields.io/badge/license-Apache%202-blue.svg
:target: https://github.com/dpkp/kafka-python/blob/master/LICENSE
.. image:: https://img.shields.io/pypi/dw/kafka-python.svg
:target: https://pypistats.org/packages/kafka-python
.. image:: https://img.shields.io/pypi/v/kafka-python.svg
:target: https://pypi.org/project/kafka-python
.. image:: https://img.shields.io/pypi/implementation/kafka-python
:target: https://github.com/dpkp/kafka-python/blob/master/setup.py
Python client for the Apache Kafka distributed stream processing system.
kafka-python is designed to function much like the official java client, with a
sprinkling of pythonic interfaces (e.g., consumer iterators).
kafka-python is best used with newer brokers (0.9+), but is backwards-compatible with
older versions (to 0.8.0). Some features will only be enabled on newer brokers.
For example, fully coordinated consumer groups -- i.e., dynamic partition
assignment to multiple consumers in the same group -- requires use of 0.9+ kafka
brokers. Supporting this feature for earlier broker releases would require
writing and maintaining custom leadership election and membership / health
check code (perhaps using zookeeper or consul). For older brokers, you can
achieve something similar by manually assigning different partitions to each
consumer instance with config management tools like chef, ansible, etc. This
approach will work fine, though it does not support rebalancing on failures.
See https://kafka-python.readthedocs.io/en/master/compatibility.html
for more details.
Please note that the master branch may contain unreleased features. For release
documentation, please see readthedocs and/or python's inline help.
.. code-block:: bash
$ pip install kafka-python
KafkaConsumer
*************
KafkaConsumer is a high-level message consumer, intended to operate as similarly
as possible to the official java client. Full support for coordinated
consumer groups requires use of kafka brokers that support the Group APIs: kafka v0.9+.
See https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html
for API and configuration details.
The consumer iterator returns ConsumerRecords, which are simple namedtuples
that expose basic message attributes: topic, partition, offset, key, and value:
.. code-block:: python
from kafka import KafkaConsumer
consumer = KafkaConsumer('my_favorite_topic')
for msg in consumer:
print (msg)
.. code-block:: python
# join a consumer group for dynamic partition assignment and offset commits
from kafka import KafkaConsumer
consumer = KafkaConsumer('my_favorite_topic', group_id='my_favorite_group')
for msg in consumer:
print (msg)
.. code-block:: python
# manually assign the partition list for the consumer
from kafka import TopicPartition
consumer = KafkaConsumer(bootstrap_servers='localhost:1234')
consumer.assign([TopicPartition('foobar', 2)])
msg = next(consumer)
.. code-block:: python
# Deserialize msgpack-encoded values
consumer = KafkaConsumer(value_deserializer=msgpack.loads)
consumer.subscribe(['msgpackfoo'])
for msg in consumer:
assert isinstance(msg.value, dict)
.. code-block:: python
# Access record headers. The returned value is a list of tuples
# with str, bytes for key and value
for msg in consumer:
print (msg.headers)
.. code-block:: python
# Read only committed messages from transactional topic
consumer = KafkaConsumer(isolation_level='read_committed')
consumer.subscribe(['txn_topic'])
for msg in consumer:
print(msg)
.. code-block:: python
# Get consumer metrics
metrics = consumer.metrics()
KafkaProducer
*************
KafkaProducer is a high-level, asynchronous message producer. The class is
intended to operate as similarly as possible to the official java client.
See https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html
for more details.
.. code-block:: python
from kafka import KafkaProducer
producer = KafkaProducer(bootstrap_servers='localhost:1234')
for _ in range(100):
producer.send('foobar', b'some_message_bytes')
.. code-block:: python
# Block until a single message is sent (or timeout)
future = producer.send('foobar', b'another_message')
result = future.get(timeout=60)
.. code-block:: python
# Block until all pending messages are at least put on the network
# NOTE: This does not guarantee delivery or success! It is really
# only useful if you configure internal batching using linger_ms
producer.flush()
.. code-block:: python
# Use a key for hashed-partitioning
producer.send('foobar', key=b'foo', value=b'bar')
.. code-block:: python
# Serialize json messages
import json
producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode('utf-8'))
producer.send('fizzbuzz', {'foo': 'bar'})
.. code-block:: python
# Serialize string keys
producer = KafkaProducer(key_serializer=str.encode)
producer.send('flipflap', key='ping', value=b'1234')
.. code-block:: python
# Compress messages
producer = KafkaProducer(compression_type='gzip')
for i in range(1000):
producer.send('foobar', b'msg %d' % i)
.. code-block:: python
# Use transactions
producer = KafkaProducer(transactional_id='fizzbuzz')
producer.init_transactions()
producer.begin_transaction()
future = producer.send('txn_topic', value=b'yes')
future.get() # wait for successful produce
producer.commit_transaction() # commit the transaction
producer.begin_transaction()
future = producer.send('txn_topic', value=b'no')
future.get() # wait for successful produce
producer.abort_transaction() # abort the transaction
.. code-block:: python
# Include record headers. The format is list of tuples with string key
# and bytes value.
producer.send('foobar', value=b'c29tZSB2YWx1ZQ==', headers=[('content-encoding', b'base64')])
.. code-block:: python
# Get producer performance metrics
metrics = producer.metrics()
Thread safety
*************
The KafkaProducer can be used across threads without issue, unlike the
KafkaConsumer which cannot.
While it is possible to use the KafkaConsumer in a thread-local manner,
multiprocessing is recommended.
Compression
***********
kafka-python supports the following compression formats:
- gzip
- LZ4
- Snappy
- Zstandard (zstd)
gzip is supported natively, the others require installing additional libraries.
See https://kafka-python.readthedocs.io/en/master/install.html for more information.
Optimized CRC32 Validation
**************************
Kafka uses CRC32 checksums to validate messages. kafka-python includes a pure
python implementation for compatibility. To improve performance for high-throughput
applications, kafka-python will use `crc32c` for optimized native code if installed.
See https://kafka-python.readthedocs.io/en/master/install.html for installation instructions.
See https://pypi.org/project/crc32c/ for details on the underlying crc32c lib.
Protocol
********
A secondary goal of kafka-python is to provide an easy-to-use protocol layer
for interacting with kafka brokers via the python repl. This is useful for
testing, probing, and general experimentation. The protocol support is
leveraged to enable a KafkaClient.check_version() method that
probes a kafka broker and attempts to identify which version it is running
(0.8.0 to 2.6+).
Owner
- Name: Dana Powers
- Login: dpkp
- Kind: user
- Location: San Francisco, CA
- Repositories: 30
- Profile: https://github.com/dpkp
GitHub Events
Total
- Create event: 165
- Release event: 21
- Issues event: 296
- Watch event: 216
- Delete event: 145
- Issue comment event: 337
- Push event: 427
- Pull request review event: 13
- Pull request review comment event: 12
- Pull request event: 352
- Fork event: 49
Last Year
- Create event: 165
- Release event: 21
- Issues event: 296
- Watch event: 216
- Delete event: 145
- Issue comment event: 337
- Push event: 427
- Pull request review event: 13
- Pull request review comment event: 12
- Pull request event: 352
- Fork event: 49
Committers
Last synced: 9 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Dana Powers | d****s@g****m | 824 |
| Dana Powers | d****s@r****o | 527 |
| Jeff Widman | j****f@j****m | 120 |
| David Arthur | m****h@g****m | 95 |
| Mahendra M | m****m@g****m | 48 |
| Mark Roberts | w****t@g****m | 47 |
| Viktor Shlapakov | v****v@g****m | 42 |
| Omar Ghishan | o****n@r****o | 40 |
| Zack Dever | z****r@p****m | 29 |
| Taras | v****1@g****m | 27 |
| Bruno Renié | b****e@g****m | 26 |
| mrtheb | m****e@g****m | 13 |
| John Anderson | s****k@g****m | 13 |
| William Barnhart | w****t@g****m | 11 |
| Vetoshkin Nikita | n****n@y****u | 10 |
| Enrico Canzonieri | e****i@g****m | 9 |
| Andre Araujo | a****o@g****m | 8 |
| Ivan Pouzyrevsky | s****o@y****u | 8 |
| Mark Roberts | m****s@k****m | 8 |
| Alex Couture-Beil | a****x@m****a | 7 |
| Thomas Dimson | t****n@g****m | 7 |
| Tincu Gabriel | g****i@a****o | 6 |
| Lou Marvin Caraig | l****g@g****m | 5 |
| Matthew L Daniel | m****l@g****m | 5 |
| Tyler Lubeck | t****r@c****m | 5 |
| Mark Roberts | w****t@f****m | 4 |
| Jim Lim | j****m@q****m | 4 |
| Carson Ip | c****5@g****m | 4 |
| Brian Sang | s****i@g****m | 4 |
| Heikki Nousiainen | h****n@a****o | 4 |
| and 195 more... | ||
Committer Domains (Top 20 + Academic)
aiven.io: 5
datadoghq.com: 4
percolate.com: 2
163.com: 2
altlinux.org: 2
shopify.com: 2
linkedin.com: 2
gmx.net: 2
yelp.com: 2
rd.io: 2
yandex-team.ru: 2
bernat.im: 1
amadeus.com: 1
everpcpc.com: 1
blocket.se: 1
adfin.com: 1
met.no: 1
tuta.io: 1
surveymonkey.com: 1
chmd.fr: 1
ieee.org: 1
nyu.edu: 1
ifi.uio.no: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 357
- Total pull requests: 425
- Average time to close issues: about 3 years
- Average time to close pull requests: 7 months
- Total issue authors: 315
- Total pull request authors: 74
- Average comments per issue: 3.0
- Average comments per pull request: 0.88
- Merged pull requests: 275
- Bot issues: 0
- Bot pull requests: 10
Past Year
- Issues: 46
- Pull requests: 302
- Average time to close issues: 4 days
- Average time to close pull requests: about 17 hours
- Issue authors: 41
- Pull request authors: 17
- Average comments per issue: 1.78
- Average comments per pull request: 0.1
- Merged pull requests: 250
- Bot issues: 0
- Bot pull requests: 3
Top Authors
Issue Authors
- jeffwidman (20)
- dpkp (5)
- berrfred (3)
- ghost (2)
- hackaugusto (2)
- millerdev (2)
- indywidualny (2)
- Pyrrha (2)
- jacopofar (2)
- lv123123long (2)
- f-r-kuznetsov (2)
- tvoinarovskyi (2)
- braedon (2)
- isamaru (2)
- wbarnha (2)
Pull Request Authors
- dpkp (275)
- wbarnha (18)
- dependabot[bot] (10)
- Romain-Geissler-1A (5)
- jeffwidman (5)
- emmanuel-ferdman (4)
- hiwakaba (4)
- mattoberle (3)
- hackaugusto (3)
- sunnyakaxd (2)
- degagne (2)
- sibiryakov (2)
- moshez (2)
- ljluestc (2)
- bmassemin (2)
Top Labels
Issue Labels
consumer (22)
producer (10)
bug (6)
admin-client (6)
test coverage (4)
packaging (4)
network protocol (3)
enhancement (3)
help request (3)
critical/stability (2)
needs investigation (2)
help wanted (2)
wontfix (1)
dependencies (1)
duplicate (1)
invalid (1)
sasl (1)
Pull Request Labels
network protocol (14)
dependencies (11)
enhancement (5)
needs investigation (4)
admin-client (4)
documentation (3)
do-not-merge (3)
github_actions (3)
test coverage (2)
producer (2)
wontfix (2)
consumer (2)
bug (1)
packaging (1)
Packages
- Total packages: 7
-
Total downloads:
- pypi 19,581,380 last-month
- Total docker downloads: 2,032,956,638
-
Total dependent packages: 234
(may contain duplicates) -
Total dependent repositories: 4,005
(may contain duplicates) - Total versions: 95
- Total maintainers: 5
pypi.org: kafka-python
Pure Python client for Apache Kafka
- Homepage: https://github.com/dpkp/kafka-python
- Documentation: https://kafka-python.readthedocs.io/
- License: Apache Software License
-
Latest release: 2.2.15
published 8 months ago
Rankings
Docker downloads count: 0.0%
Downloads: 0.1%
Dependent packages count: 0.1%
Dependent repos count: 0.2%
Average: 0.3%
Stargazers count: 0.4%
Forks count: 1.1%
Last synced:
6 months ago
pypi.org: kafka
Pure Python client for Apache Kafka
- Homepage: https://github.com/dpkp/kafka-python
- Documentation: https://kafka.readthedocs.io/
- License: Apache License 2.0
-
Latest release: 1.3.5
published over 8 years ago
Rankings
Stargazers count: 0.4%
Downloads: 0.6%
Docker downloads count: 0.8%
Dependent repos count: 0.8%
Average: 0.8%
Forks count: 1.1%
Dependent packages count: 1.3%
Maintainers (1)
Last synced:
6 months ago
pypi.org: kafka-python3
Pure Python client for Apache Kafka
- Homepage: https://github.com/dpkp/kafka-python
- Documentation: https://kafka-python3.readthedocs.io/
- License: Apache License 2.0
-
Latest release: 3.0.0
published almost 4 years ago
Rankings
Stargazers count: 0.4%
Forks count: 1.1%
Downloads: 1.6%
Average: 2.0%
Dependent repos count: 2.6%
Docker downloads count: 2.8%
Dependent packages count: 3.3%
Maintainers (1)
Last synced:
6 months ago
proxy.golang.org: github.com/dpkp/kafka-python
- Documentation: https://pkg.go.dev/github.com/dpkp/kafka-python#section-documentation
- License: apache-2.0
-
Latest release: v0.9.5
published about 10 years ago
Rankings
Forks count: 0.7%
Stargazers count: 0.9%
Average: 3.7%
Dependent repos count: 4.8%
Dependent packages count: 8.5%
Last synced:
6 months ago
pypi.org: gc-kafka-python
Pure Python client for Apache Kafka
- Homepage: https://github.com/dpkp/kafka-python
- Documentation: https://gc-kafka-python.readthedocs.io/
- License: Apache License 2.0
-
Latest release: 0.9.8
published about 10 years ago
Rankings
Stargazers count: 0.4%
Forks count: 1.1%
Dependent packages count: 7.4%
Average: 8.6%
Dependent repos count: 11.9%
Downloads: 22.4%
Maintainers (1)
Last synced:
6 months ago
pypi.org: ns-kafka-python
Pure Python client for Apache Kafka
- Homepage: https://github.com/dpkp/kafka-python
- Documentation: https://ns-kafka-python.readthedocs.io/
- License: Apache License 2.0
-
Latest release: 1.4.7
published over 5 years ago
Rankings
Stargazers count: 0.4%
Forks count: 1.1%
Dependent packages count: 7.4%
Average: 9.3%
Downloads: 15.5%
Dependent repos count: 22.2%
Maintainers (1)
Last synced:
6 months ago
conda-forge.org: kafka-python
- Homepage: https://github.com/dpkp/kafka-python
- License: Apache-2.0
-
Latest release: 2.0.2
published over 5 years ago
Rankings
Forks count: 4.1%
Stargazers count: 4.8%
Average: 10.8%
Dependent repos count: 14.7%
Dependent packages count: 19.6%
Last synced:
6 months ago
Dependencies
docs/requirements.txt
pypi
- sphinx *
- sphinx_rtd_theme *
requirements-dev.txt
pypi
- Sphinx ==3.2.1
- coveralls ==2.1.2
- crc32c ==2.1
- docker-py ==1.10.6
- flake8 ==3.8.3
- lz4 ==3.1.0
- mock ==4.0.2
- py ==1.9.0
- pylint ==2.6.0
- pytest ==6.0.2
- pytest-cov ==2.10.1
- pytest-mock ==3.3.1
- pytest-pylint ==0.17.0
- python-snappy ==0.5.4
- sphinx-rtd-theme ==0.5.0
- tox ==3.20.0
- xxhash ==2.0.0