Hot updates
- dbt 1.0.0 released
- dbt is gaining popularity
- A great tool that solves a real problem
- RedisJSON is out for public preview https://redis.com/blog/redisjson-public-preview-performance-benchmarking/
- Need to have Redis 6.x or later
- Probably a good point to mention once again that RedisJSON* is faster than MongoDB and Elasticsearch on direct read, write, and update workloads.
- available in Redis Cloud
or you can always build it yourself
- Basically a bunch of JSON commands for a “native” JSON experience:
- JSON.SET
- JSON.GET
- JSON.NUMINCRBY
- Client libraries for Go/Node.js/Python/Java/.NET/PHP/Ruby
only the Java and Python ones are official libs
- You can index your JSON documents using RediSearch, and you can set it up to update indexes on every write
- Check the benchmarks
- Why has the community been dissing MongoDB recently? Their SSPL license is to blame
- FerretDB: a relevant and interesting solution, a MongoDB interface over PostgreSQL
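To make the three commands listed above a bit more concrete, here is a minimal in-memory sketch of what JSON.SET, JSON.GET, and JSON.NUMINCRBY roughly do. It is a toy model, not the Redis client and not how the module is implemented: `ToyJsonStore` and its method names are made up for illustration, only `$` and simple `$.a.b` paths are handled, and the reply format of the real module differs.

```python
from typing import Any, Dict, List


class ToyJsonStore:
    """In-memory toy that mimics three RedisJSON commands (hypothetical, not a Redis client)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Any] = {}

    @staticmethod
    def _parts(path: str) -> List[str]:
        # Only "$" (document root) and simple "$.a.b" paths are supported here.
        if path == "$":
            return []
        if not path.startswith("$."):
            raise ValueError(f"unsupported path: {path}")
        return path[2:].split(".")

    def json_set(self, key: str, path: str, value: Any) -> None:
        """Rough analogue of: JSON.SET key path value"""
        parts = self._parts(path)
        if not parts:
            # Setting the root replaces (or creates) the whole document.
            self._docs[key] = value
            return
        node = self._docs[key]
        for part in parts[:-1]:
            node = node[part]
        node[parts[-1]] = value

    def json_get(self, key: str, path: str = "$") -> Any:
        """Rough analogue of: JSON.GET key path"""
        node = self._docs[key]
        for part in self._parts(path):
            node = node[part]
        return node

    def json_numincrby(self, key: str, path: str, delta: float) -> float:
        """Rough analogue of: JSON.NUMINCRBY key path delta"""
        new_value = self.json_get(key, path) + delta
        self.json_set(key, path, new_value)
        return new_value


store = ToyJsonStore()
store.json_set("user:1", "$", {"name": "Ann", "stats": {"visits": 1}})
store.json_set("user:1", "$.name", "Anna")                    # update one field in place
visits = store.json_numincrby("user:1", "$.stats.visits", 2)  # increment a nested number
doc = store.json_get("user:1")
```

The point of the “native” JSON experience is visible even in the toy: a single field is updated or incremented by path, without reading the whole document, changing it in the application, and writing it back.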
- Neo4j 4.4 came out last December
- user impersonation is the main new feature
- All cloud providers have their own graph db
- GCP and AWS provide AuraDB: Neo4j's managed cloud service
- Amazon Neptune
- Azure CosmosDB
- Neo4j can be queried with two main languages: Cypher (native) and Gremlin (via Apache TinkerPop)
Gremlin is a graph traversal language with a Java API. Technically, Gremlin itself is also a traversal engine
… “cypher” is also a term for freestyle rap in a group setting, which is worth keeping in mind when you search online for Cypher tips
- Some say Cypher looks very Lua-ish
- Neo4j Desktop and a browser interface
additional plugins for easier visualizations/explorations
- “I wonder if we could solve this year’s AoC problems with some DB like that”
- O’Reilly often gives out their Neo4j book for free
- Why would you ever need a graph database?
Lightning news
- Apache IoTDB 0.12.4
- group by multi level
- Lots of major vendor release updates due to the well-known log4j vulnerabilities
Do not forget to update log4j if you haven’t yet
- Calcite 1.29.0
- Log4j obviously
- Apache Beam new release
- Minor release, 3 breaking changes
- This is the last time we mention it here, because they release new versions weekly
- LakeFS new releases
- They just never stop:
- performance improvements
- new OpenAPI method
- security check
- Apache ORC 1.7.2 released
- It’s just good to know that this format is still alive and is being developed.
- row-level filtering in a columnar storage format
- now row-level predicates will work on rows (at the reader level)!
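A small sketch of what the ORC note about row-level predicates means in practice. It assumes the earlier behaviour was that a predicate was only used to skip whole chunks of the file based on min/max statistics, so rows that did not match could still reach the caller. Everything here (`RowGroup`, the two reader functions, the data) is a hypothetical in-memory toy, not the ORC API.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class RowGroup:
    """Toy stand-in for a chunk of a columnar file that carries min/max statistics."""
    values: List[int]

    @property
    def max_value(self) -> int:
        return max(self.values)


def read_with_group_pruning(groups: List[RowGroup], lower: int) -> Iterator[int]:
    """Predicate `value >= lower` is used only to skip whole groups via statistics."""
    for group in groups:
        if group.max_value >= lower:
            # The group *may* contain matches, so every row in it is returned.
            yield from group.values


def read_with_row_filter(groups: List[RowGroup], lower: int) -> Iterator[int]:
    """Same pruning, plus the predicate is applied to each row inside the reader."""
    for group in groups:
        if group.max_value < lower:
            continue  # statistics prove that no row in this group can match
        for value in group.values:
            if value >= lower:
                yield value


groups = [RowGroup([1, 2, 3]), RowGroup([4, 9, 5]), RowGroup([10, 11, 2])]
pruned = list(read_with_group_pruning(groups, lower=5))
filtered = list(read_with_row_filter(groups, lower=5))
```

With pruning only, the caller still gets 4 and 2 back and has to filter them out itself; with the row-level filter, only matching rows leave the reader.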
Discussion: ETL and Reverse ETL
Dr. Igor Mosyagin
Data Engineer @ Klarna
Igor identifies himself as a pragmatic engineer with a strong academic background. A theoretical physicist by training, he eventually assumed he had enough PhDs and left Academia to work with Data-* related things. As of 2022, Igor works as a Data Platform Engineer at Klarna. On top of that, he’s a huge fan of cephalopods, math rock, and quantum mechanics. He also hates baked carrots so much he decided to mention it in this bio.
Pasha Finkelshteyn
Developer advocate @ JetBrains
With 14 years of experience in IT, Pasha has been through fire and water, from technical support to developer, team lead, and data engineer. Now Pasha works as a developer advocate for Data Engineering at JetBrains. He helps develop the Big Data Tools plugin and gives talks on Kotlin, various aspects of data engineering, and working with data. He is also the author and maintainer of the Kotlin API for Apache Spark.