My Blog Posts.

A guide to configuring the Hive Metastore to use a MySQL server as its backend RDBMS for metadata storage, and enabling Spark to connect to the Metastore.

Sep 22, 2023

Naveen Kannan

Using Ansible to install Hive on a Spark cluster.

Computing

Clusters

Discussion

Ansible

Hive

Spark

A detailed look at Ansible playbooks, with a worked example: installing Hive on a Spark cluster.

Sep 4, 2023

Naveen Kannan

Installing and configuring Hadoop and Spark on a 4 node cluster.

Computing

Clusters

Discussion

Hadoop

Spark

HDFS

YARN

A guide to installing Hadoop and Spark on a 4-node cluster and setting up HDFS, YARN, and MapReduce.

Aug 21, 2023

Naveen Kannan

Using Ansible to remotely configure a cluster.

Computing

Clusters

Discussion

Ansible

Docker

Containers

Using a containerized instance of Ansible to connect remotely to a cluster and run a simple ping task to confirm the connection.

Jun 24, 2023

Naveen Kannan

Docker, Singularity, and HPC.

Computing

HPC

Containers

Discussion

Docker

Singularity

A brief rundown of Docker and Singularity, and their relevance in HPC environments.

Jun 17, 2023

Naveen Kannan

SLURM and HPC.

Computing

HPC

Discussion

Clusters

An introduction to SLURM in the context of HPC clusters.

Jun 15, 2023

Naveen Kannan

p values, Statistical Significance, and the magic number.

Statistics

Discussion

A brief exploration of p-values.

Mar 22, 2023

Naveen Kannan

Introduction to PXE boot servers.

Integrating HDFS and PostgreSQL through Apache Spark.

Mamba implementation in Scientific Pipelines.

Moving Docker’s Data directory to another location.

Installing and configuring the HIVE metastore with a MySQL backend.

Using Ansible to install Hive on a Spark cluster.

Installing and configuring Hadoop and Spark on a 4 node cluster.

Using Ansible to remotely configure a cluster.

Docker, Singularity, and HPC.

SLURM and HPC.

p values, Statistical Significance, and the magic number.