SlideShare a Scribd company logo
1 of 33
Download to read offline
@apachepinot | @KishoreBytes
Apache Pinot Case Study
Building distributed analytics systems
using Apache Kafka
@apachepinot | @KishoreBytes
@apachepinot | @KishoreBytes
Pinot @LinkedIn
@apachepinot | @KishoreBytes
70+
Products
Pinot @ LinkedIn
User Facing Analytics
120k+
queries/sec
ms - 1s
latency
@apachepinot | @KishoreBytes
Pinot @ LinkedIn
Business Metrics Analytics
10k+
Metrics
50k+
Dimensions
@apachepinot | @KishoreBytes
Pinot @ LinkedIn
ThirdEye: Anomaly detection and root cause analysis
50+
Teams
100K
Time Series
@apachepinot | @KishoreBytes
Apache Pinot @
Other Companies
2.7k
Github StarsSlack UsersCompanies
400+20+
Community has tripled in the last two quarters
Join our growing community on the Apache Pinot Slack Channel
https://communityinviter.com/apps/apache-pinot/apache-pinot
@apachepinot | @KishoreBytes
User Facing
Applications
Business Facing
Metrics
Anomaly Detection
Time Series
Multiple Use Cases:
One Platform
Kafka
70+
10k
100k
120k
Queries/secEvents/sec
1M+
@apachepinot | @KishoreBytes
Challenges of User facing real-time analytics
Velocity of
ingestion
High
Dimensionality
1000s of QPS
Milliseconds
Latency
Seconds
Freshness
Highly
Available Scalable
Cost
Effective
User-facing
real-time
analytics
system
@apachepinot | @KishoreBytes
Pinot Real-time Ingestion
Deep Dive
@apachepinot | @KishoreBytes
Pinot Architecture
Servers
Brokers
Queries
Scatter Gather
● Servers - Consuming,
indexing, serving
● Brokers - Scatter gather
@apachepinot | @KishoreBytes
Server 1
Deep Store
Pinot Realtime Ingestion Basics
● Kafka Consumer on Pinot Server
● Periodically create “Pinot segment”
● Persist to deep store
● In memory data - queryable
● Continue consumption
@apachepinot | @KishoreBytes
Kafka Consumer Groups
Approach 1
@apachepinot | @KishoreBytes
Kafka Consumer Group based design
● Each consumer consumes
from 1 or more partitions
Server 2Server 1
time
3 partitions
Consumer Group
Kafka
Consumer
Kafka
Consumer
● Periodic checkpointing
● Kafka Rebalancer
Server1 starts
consuming from
0 and 2
Checkpoint 350
Checkpoint 400
seg1 seg2
Kafka
Rebalancer
● Fault tolerant consumption
@apachepinot | @KishoreBytes
Challenges with Capacity Expansion
Server 2S1
Add Server3
Partition 2 moves
to Server 3
Server3 begins consumption from 400time
Server 3
Duplicate Data!
3 partitions
Kafka
Consumer
Kafka
Consumer
Consumer Group
Kafka
Consumer
Checkpoint 350
Checkpoint 400
seg1 seg2
Kafka
Rebalancer
Server1 starts
consuming from
0 and 2
@apachepinot | @KishoreBytes
Deep store
Multiple Consumer Groups
Consumer Group 1
Consumer Group 2
3 partitions
2 replicas
● No control over partitions
assigned to consumer
● No control over checkpointing
● Segment disparity
Queries
Fault tolerant
● Storage inefficient
@apachepinot | @KishoreBytes
Operational Complexity
Queries
Consumer Group 1
Consumer Group 2
3 partitions
2 replicas
● Disable consumer group for
node failure/capacity changes
@apachepinot | @KishoreBytes
Server 4
Scalability limitation
Queries
Consumer Group 1
Consumer Group 2
3 partitions
2 replicas
● Scalability limited by #partitions
Idle
● Cost inefficient
@apachepinot | @KishoreBytes
Single node in a Consumer Group
● Eliminates incorrect results
● Reduced operational complexity
Server 1
Server 2
● Limited by capacity of 1 node
● Storage overhead
● Scalability limitation
Consumer
Group 1
Consumer
Group 2
3 partitions
2 replicas
The only deployment model that worked
@apachepinot | @KishoreBytes
Incorrect
Results
Operational
Complexity
Storage
overhead
Limited
scalability
Expensive
Multi-node
Consumer
Group
Y Y Y Y Y
Single-node
Consumer
Group
Y Y Y
Issues with
Kafka Consumer Group based solution
@apachepinot | @KishoreBytes
Problem 1
Lack of control with Kafka Rebalancer
Solution
Take control of partition assignment
@apachepinot | @KishoreBytes
Problem 2
Segment Disparity due to checkpointing mechanism
Solution
Take control of checkpointing
@apachepinot | @KishoreBytes
Partition Level Consumption
Approach 2
@apachepinot | @KishoreBytes
S1 S3
Partition Level Consumption
Controller
S23 partitions
2 replicas
Partition Server State Start
offset
End
offset
S1
S2
CONSUMING
CONSUMING 20
S3
S1
CONSUMING
CONSUMING 20
S2
S3
CONSUMING
CONSUMING 20
0
1
2
Cluster State
● Single coordinator across all
replicas
● All actions determined by
cluster state
@apachepinot | @KishoreBytes
Deep Store
S1 S3
Partition Level Consumption
Controller
S23 partitions
2 replicas
Partition Server State Start
offset
End
offset
0
S1
S2
CONSUMING
CONSUMING 20
1
S3
S1
CONSUMING
CONSUMING 20
2
S2
S3
CONSUMING
CONSUMING 20
Cluster State
Commit
80
110
110ONLINE
ONLINE
● Only 1 server persists
segment to deep store
● Only 1 copy stored
@apachepinot | @KishoreBytes
Deep Store
S1 S3
Partition Level Consumption
Controller
S23 partitions
2 replicas
Partition Server State Start
offset
End
offset
0
S1
S2 20
1
S3
S1
CONSUMING
CONSUMING 20
2
S2
S3
CONSUMING
CONSUMING 20
Cluster State
110
ONLINE
ONLINE
● All other replicas
○ Download from deep
store
● Segment equivalence
@apachepinot | @KishoreBytes
Deep Store
S1 S3
Partition Level Consumption
Controller
S23 partitions
2 replicas
Partition Server State Start
offset
End
offset
0
S1
S2
ONLINE
ONLINE
20 110
1
S3
S1
CONSUMING
CONSUMING
20
2
S2
S3
CONSUMING
CONSUMING
20
Cluster State
0
S1
S2
CONSUMING
CONSUMING
110
● New segment state created
● Start where previous segment left off
@apachepinot | @KishoreBytes
Deep Store
S1 S3
Partition Level Consumption
Controller
S23 partitions
2 replicas
Partition Server State Start
offset
End
offset
0
S1
S2
ONLINE
ONLINE
20 110
1
S3
S1
ONLINE
ONLINE
20 120
2
S2
S3
ONLINE
ONLINE
20 100
Cluster State
0
S1
S2
CONSUMING
CONSUMING
110
1
S3
S1
CONSUMING
CONSUMING
120
2
S2
S3
CONSUMING
CONSUMING
100
● Each partition independent
of others
@apachepinot | @KishoreBytes
Deep Store
S1 S3
Capacity expansion
Controller
S23 partitions
2 replicas
S4
● Consuming segment - Restart consumption
using offset in cluster state
● Pinot segment - Download from deep store
● Easy to handle changes in
replication/partitions
● No duplicates!
● Cluster state table updated
@apachepinot | @KishoreBytes
S1 S3
Node failures
Controller
S23 partitions
2 replicas
S4
● At least 1 replica still alive
● No complex operations
@apachepinot | @KishoreBytes
S1 S3
Scalability
Controller
S23 partitions
2 replicas
S4
● Easily add nodes
● Segment equivalence =
Smart segment assignment
+ Smart query routing
S6 S5
Completed
Servers
Consuming
Servers
@apachepinot | @KishoreBytes
Incorrect
Results
Operational
Complexity
Storage
overhead
Limited
scalability
Expensive
Multi-node
Consumer
Group
Y Y Y Y Y
Single-node
Consumer
Group
Y Y Y
Partition
Level
Consumers
Summary
@apachepinot | @KishoreBytes
Q&A
pinot.apache.org
@apachepinot

More Related Content

What's hot

Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Databricks
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowDremio Corporation
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumTathastu.ai
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberXiang Fu
 
Performance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabasePerformance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabaseTung Nguyen Thanh
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...Simplilearn
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...StreamNative
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
 

What's hot (20)

Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache Arrow
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber
 
Dremio introduction
Dremio introductionDremio introduction
Dremio introduction
 
Performance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabasePerformance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL Database
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 

Similar to Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache Kafka (Neha Pawar, Stealth Mode Startup) Kafka Summit 2020

Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...
Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...
Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...HostedbyConfluent
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaRicardo Bravo
 
Keystone - ApacheCon 2016
Keystone - ApacheCon 2016Keystone - ApacheCon 2016
Keystone - ApacheCon 2016Peter Bakas
 
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Natan Silnitsky
 
MyHeritage Kakfa use cases - Feb 2014 Meetup
MyHeritage Kakfa use cases - Feb 2014 Meetup MyHeritage Kakfa use cases - Feb 2014 Meetup
MyHeritage Kakfa use cases - Feb 2014 Meetup Ran Levy
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasMonal Daxini
 
Streaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in ProductionStreaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in Productionconfluent
 
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogicRakuten Group, Inc.
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaShiao-An Yuan
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingApache Apex
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5Peter Lawrey
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
 
Event Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQEvent Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQAraf Karsh Hamid
 
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...Natan Silnitsky
 
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 PerformanceTYPO3 CertiFUNcation
 
TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)Marcus Schwemer
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017Monal Daxini
 
HadoopCon- Trend Micro SPN Hadoop Overview
HadoopCon- Trend Micro SPN Hadoop OverviewHadoopCon- Trend Micro SPN Hadoop Overview
HadoopCon- Trend Micro SPN Hadoop OverviewYafang Chang
 

Similar to Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache Kafka (Neha Pawar, Stealth Mode Startup) Kafka Summit 2020 (20)

Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...
Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...
Look how easy it is to go from events to blazing-fast analytics! | Neha Pawar...
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Amazon Kinesis
Amazon KinesisAmazon Kinesis
Amazon Kinesis
 
Keystone - ApacheCon 2016
Keystone - ApacheCon 2016Keystone - ApacheCon 2016
Keystone - ApacheCon 2016
 
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
 
MyHeritage Kakfa use cases - Feb 2014 Meetup
MyHeritage Kakfa use cases - Feb 2014 Meetup MyHeritage Kakfa use cases - Feb 2014 Meetup
MyHeritage Kakfa use cases - Feb 2014 Meetup
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paas
 
Streaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in ProductionStreaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in Production
 
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
 
Event Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQEvent Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQ
 
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
 
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
 
TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 
History of Apache Pinot
History of Apache Pinot History of Apache Pinot
History of Apache Pinot
 
HadoopCon- Trend Micro SPN Hadoop Overview
HadoopCon- Trend Micro SPN Hadoop OverviewHadoopCon- Trend Micro SPN Hadoop Overview
HadoopCon- Trend Micro SPN Hadoop Overview
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 

Recently uploaded (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 

Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache Kafka (Neha Pawar, Stealth Mode Startup) Kafka Summit 2020