BigDataEnthusiast · Internals of Apache Iceberg · In this blog we are going to explore the architectural components of Apache Iceberg. · 4 min read · Jul 2, 2023
BigDataEnthusiast · Polars Dataframe — SQL Interface · While dealing with Polars dataframes in Python, instead of using dataframe APIs (e.g. filter, select, join, etc.) for data transformation… · 4 min read · Apr 15, 2024
BigDataEnthusiast · Apache Iceberg — Hidden Partitioning · In this blog we will explore the “Hidden Partitioning” concept in Apache Iceberg. · 5 min read · Mar 30, 2024
BigDataEnthusiast · MinIO — High Performance Object Storage · MinIO is a high-performance, Kubernetes-native object store. · 4 min read · Aug 20, 2023
BigDataEnthusiast · Apache Spark — Log Parsing using regexp_extract · regexp_extract is an Apache Spark built-in function that takes a column object, a regular expression as a string, and a group index, and extracts a… · 3 min read · Aug 19, 2023
BigDataEnthusiast · Spark Scala — RDD zipWithIndex · Suppose you have a file with unwanted header lines that you don’t want to process. · 2 min read · Aug 17, 2023
BigDataEnthusiast · Apache Spark: Explode Function · An Apache Spark built-in function that takes a column object (array or map type) as input and returns a new row for each element in the given… · 6 min read · Aug 15, 2023
BigDataEnthusiast · Apache Iceberg — Insert Overwrite · INSERT OVERWRITE can replace the data in an Iceberg table, depending on the configuration set and how it is used. · 3 min read · Jul 9, 2023
BigDataEnthusiast · Apache Iceberg Table — v2 Format — merge-on-read · The previous blog covered the different Iceberg table formats and the write modes supported. Please refer to the link below. · 5 min read · Jul 8, 2023
BigDataEnthusiast · Apache Iceberg Table — v1 Format — copy-on-write · The previous blog covered the different Iceberg table formats and the write modes supported. Please refer to the link below. · 4 min read · Jul 8, 2023