<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Big-Data on Datro - From Data to Action | Tailored Web Apps. Real Business Value.</title><link>https://datro.co.za/tags/big-data/</link><description>Recent content in Big-Data on Datro - From Data to Action | Tailored Web Apps. Real Business Value.</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 26 Sep 2025 10:37:13 +0200</lastBuildDate><atom:link href="https://datro.co.za/tags/big-data/index.xml" rel="self" type="application/rss+xml"/><item><title>Apache Iceberg - Table Format for Data Lakes</title><link>https://datro.co.za/tech/iceberg/</link><pubDate>Wed, 06 Aug 2025 00:00:00 +0000</pubDate><guid>https://datro.co.za/tech/iceberg/</guid><description>&lt;h1 id="apache-iceberg-table-format-for-data-lakes">Apache Iceberg: Table Format for Data Lakes&lt;/h1>
&lt;h2 id="why-we-choose-apache-iceberg">Why We Choose Apache Iceberg&lt;/h2>
&lt;p>Apache Iceberg is an open table format for data lakes. It provides ACID transactions, schema evolution, and time travel, which change how we store, query, and manage large-scale data. Here&amp;rsquo;s why it&amp;rsquo;s the foundation of our modern data architecture.&lt;/p>
&lt;h3 id="acid-compliance-for-data-lakes">&lt;strong>ACID Compliance for Data Lakes&lt;/strong>&lt;/h3>
&lt;p>Iceberg brings enterprise-grade reliability to data lakes:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>ACID Transactions&lt;/strong>: Full atomicity, consistency, isolation, and durability&lt;/li>
&lt;li>&lt;strong>Schema Evolution&lt;/strong>: Safe schema changes without data corruption&lt;/li>
&lt;li>&lt;strong>Time Travel&lt;/strong>: Query the table as of any retained snapshot or timestamp&lt;/li>
&lt;li>&lt;strong>Hidden Partitioning&lt;/strong>: Partition values are derived from column transforms, so queries do not need to reference the physical layout&lt;/li>
&lt;li>&lt;strong>Metadata Management&lt;/strong>: Efficient metadata handling for large datasets&lt;/li>
&lt;/ul>
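&lt;p>The snapshot model behind atomic commits and time travel can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the Iceberg API: &lt;code>ToyTable&lt;/code>, &lt;code>Snapshot&lt;/code>, &lt;code>commit&lt;/code>, and &lt;code>as_of&lt;/code> are hypothetical names, and a single list append stands in for the atomic metadata swap that a real catalog performs.&lt;/p>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Immutable view of the table: the data files visible at commit time."""
    snapshot_id: int
    timestamp_ms: int
    files: tuple


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table; not part of any Iceberg library."""
    snapshots: list = field(default_factory=list)

    def commit(self, timestamp_ms, added=(), removed=()):
        # Build the new file list from the current snapshot, never in place.
        current = self.snapshots[-1].files if self.snapshots else ()
        files = tuple(f for f in current if f not in removed) + tuple(added)
        snap = Snapshot(len(self.snapshots) + 1, timestamp_ms, files)
        # One append models the atomic switch to the new table state.
        self.snapshots.append(snap)
        return snap

    def as_of(self, timestamp_ms):
        # Time travel: the latest snapshot committed at or before the timestamp.
        visible = [s for s in self.snapshots if timestamp_ms >= s.timestamp_ms]
        if not visible:
            raise ValueError("no snapshot at or before that timestamp")
        return visible[-1]


table = ToyTable()
table.commit(1000, added=("a.parquet",))
table.commit(2000, added=("b.parquet",))
table.commit(3000, added=("c.parquet",), removed=("a.parquet",))

print(table.as_of(2500).files)    # state before the third commit
print(table.snapshots[-1].files)  # current state
```

&lt;p>Because earlier snapshots are never modified, a reader pinned to an old snapshot keeps a consistent view while new commits land. Time travel only reaches snapshots that have not been expired.&lt;/p>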
&lt;h3 id="performance-and-scalability">&lt;strong>Performance and Scalability&lt;/strong>&lt;/h3>
&lt;p>Iceberg improves query performance through metadata-driven planning and file maintenance:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Partition Pruning&lt;/strong>: Intelligent partition elimination for faster queries&lt;/li>
&lt;li>&lt;strong>Column Projection&lt;/strong>: Read only the columns you need&lt;/li>
&lt;li>&lt;strong>File Skipping&lt;/strong>: Skip irrelevant files based on metadata&lt;/li>
&lt;li>&lt;strong>Compaction&lt;/strong>: Maintenance procedures that rewrite small files into larger ones and expire old snapshots&lt;/li>
&lt;li>&lt;strong>Caching&lt;/strong>: Efficient metadata caching for repeated queries&lt;/li>
&lt;/ul>
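&lt;p>File skipping is easiest to see with per-file column bounds. The sketch below assumes each data file records the minimum and maximum of one integer column. &lt;code>DataFile&lt;/code> and &lt;code>files_to_scan&lt;/code> are hypothetical names for illustration, not Iceberg classes, and the file paths are made up.&lt;/p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical metadata entry: path plus min/max of one column."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(files, query_lo, query_hi):
    """Keep only files whose [min, max] range overlaps the query range."""
    return [
        f for f in files
        if f.max_value >= query_lo and query_hi >= f.min_value
    ]


files = [
    DataFile("part-0.parquet", 1, 100),
    DataFile("part-1.parquet", 101, 200),
    DataFile("part-2.parquet", 201, 300),
]

# Filter equivalent to: value BETWEEN 150 AND 220
selected = files_to_scan(files, 150, 220)
print([f.path for f in selected])  # ['part-1.parquet', 'part-2.parquet']
```

&lt;p>The planner decides from metadata alone, so &lt;code>part-0.parquet&lt;/code> is never opened. The check is conservative: a kept file may still contain no matching rows, but a skipped file cannot contain any.&lt;/p>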
&lt;h3 id="key-benefits-for-our-clients">&lt;strong>Key Benefits for Our Clients&lt;/strong>&lt;/h3>
&lt;h4 id="1-data-reliability">1. &lt;strong>Data Reliability&lt;/strong>&lt;/h4>
&lt;p>Atomic commits keep every reader on a consistent snapshot, and earlier snapshots remain available for rollback, even when multiple engines write to the same table.&lt;/p></description></item></channel></rss>