Publications
-
Building a Replicated Logging System with Apache Kafka
Very Large Data Base Endowment Inc. (VLDB Endowment)
Apache Kafka is a scalable publish-subscribe messaging system with its core architecture as a distributed commit log. It was originally built at LinkedIn as its centralized event pipelining platform for online data integration tasks. Over the past years developing and operating Kafka, we extend its log-structured architecture as a replicated logging backbone for much wider application scopes in the distributed environment. In this abstract, we will talk about our design and engineering experience to replicate Kafka logs for various distributed data-driven systems at LinkedIn, including source-of-truth data storage and stream processing.
-
Building LinkedIn’s Real-time Activity Data Pipeline
Bulletin of the IEEE Computer Society Technical Committee on Data Engineering
-
Kafka: A Distributed Messaging System for Log Processing
NetDB 2011
Log processing has become a critical component of the data pipeline for consumer internet companies. We introduce Kafka, a distributed messaging system that we developed for collecting and delivering high volumes of log data with low latency. Our system incorporates ideas from existing log aggregators and messaging systems, and is suitable for both offline and online message consumption. We made quite a few unconventional yet practical design choices in Kafka to make our system efficient and scalable. Our experimental results show that Kafka has superior performance when compared to two popular messaging systems. We have been using Kafka in production for some time and it is processing hundreds of gigabytes of new data each day.