Categories
AI

Custom Instructions Prompt

Act as Professor Synapse ❤️, a conductor of expert agents. Your job is to support me in accomplishing my goals by finding alignment with me, then calling upon an expert agent perfectly suited to the task by initialising:

Synapse_CoR = “[emoji]: I am an expert in [role&domain]. I know [context]. I will reason step-by-step to determine the best course of action to achieve [goal]. I can use [tools] and [relevant frameworks] to help in this process.

I will help you accomplish your goal by following these steps:
[reasoned steps]

My task ends when [completion].

[first step, question]”

Instructions:
1. ❤️ gather context, relevant information and clarify my goals by asking questions
2. Once confirmed, initialize Synapse_CoR
3. ❤️ and ${emoji} support me until goal is complete

Commands:
/start=❤️,introduce and begin with step one
/ts=❤️,summon (Synapse_CoR*3) town square debate
/save=❤️, restate goal, summarize progress, reason next step

Personality:
-curious, inquisitive, encouraging
-use emojis to express yourself

Rules:
-End every output with a question or reasoned next step
-Start every output with ❤️: or ${emoji}: to indicate who is speaking.
-Organize every output with ❤️ aligning on my request, followed by ${emoji} response
-❤️, recommend save after each task is completed

Categories
AI

Embedding Impact Across Model Configurations

Fine-tuning a model like ChatGPT with embeddings involves a few steps. Here’s a simplified outline of the process:

  1. Embedding Generation:
  • Use an embedding model (e.g., BERT, Word2Vec) to generate embeddings for your data.
python generate_embeddings.py --data_file <data_file> --embedding_file <embedding_file>
  2. Prepare Data:
  • Combine embeddings with your data in a format suitable for training.
python prepare_data.py --embedding_file <embedding_file> --output_file <training_data>
  3. Fine-tuning:
  • Use the prepared data to fine-tune ChatGPT.
python run_finetuning.py --model_name_or_path gpt-3.5-turbo --train_file <training_data> --output_dir <output_dir>
  4. Evaluation:
  • Evaluate the fine-tuned model on a separate dataset to check the performance.
python evaluate.py --model_name_or_path <output_dir> --eval_file <eval_data>

Example

Below are simplified example scripts and data file content to give you an idea of how this process might be structured.

  1. Example Content of data_file:
question: What is the capital of France? | answer: Paris
question: What is the capital of Germany? | answer: Berlin
...
  2. generate_embeddings.py:
from transformers import BertTokenizer, BertModel
import torch

def generate_embeddings(data_file, embedding_file):
    # Load a pre-trained BERT tokenizer and model
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')

    with open(data_file, 'r') as f:
        data = f.readlines()

    embeddings = []
    for line in data:
        # Tokenize each line and run it through BERT without tracking gradients
        inputs = tokenizer(line, return_tensors='pt', truncation=True)
        with torch.no_grad():
            outputs = model(**inputs)
        # Mean-pool the final hidden states into a single vector per line
        embeddings.append(outputs.last_hidden_state.mean(dim=1).squeeze().tolist())

    # Write one comma-separated embedding per line
    with open(embedding_file, 'w') as f:
        for embedding in embeddings:
            f.write(','.join(map(str, embedding)) + '\n')

if __name__ == "__main__":
    generate_embeddings('<data_file>', '<embedding_file>')
  3. prepare_data.py:
def prepare_data(embedding_file, training_data):
    # Simplified: copies the pre-computed embeddings straight into the training file.
    # A real pipeline would pair each embedding with its source question/answer text.
    with open(embedding_file, 'r') as ef, open(training_data, 'w') as tf:
        for line in ef:
            embedding = line.strip()
            tf.write(embedding + '\n')

if __name__ == "__main__":
    prepare_data('<embedding_file>', '<training_data>')
  4. run_finetuning.py:
from transformers import GPT2Tokenizer, GPT2LMHeadModel, TextDataset, DataCollatorForLanguageModeling, Trainer, TrainingArguments

def fine_tune(training_data, output_dir):
    # 'gpt2' is the Hugging Face model id for GPT-2
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')

    train_dataset = TextDataset(
        tokenizer=tokenizer,
        file_path=training_data,
        block_size=128,
    )

    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer,
        mlm=False,
    )

    training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=True,
        num_train_epochs=1,
        per_device_train_batch_size=32,
        save_steps=10_000,
        save_total_limit=2,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
    )

    trainer.train()

if __name__ == "__main__":
    fine_tune('<training_data>', '<output_dir>')
  5. evaluate.py:
from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline
import torch

def evaluate(model_path, eval_file):
    # Load the fine-tuned model and tokenizer
    model = GPT2LMHeadModel.from_pretrained(model_path)
    tokenizer = GPT2Tokenizer.from_pretrained(model_path)
    
    # Create a text generation pipeline
    text_generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
    
    # Read the evaluation data
    with open(eval_file, 'r') as f:
        eval_data = f.readlines()
    
    # Iterate through the evaluation data and generate responses
    for i, prompt in enumerate(eval_data):
        generated_text = text_generator(prompt, max_length=150, do_sample=True, temperature=0.7)
        print(f'{i+1}. Prompt: {prompt.strip()}\n   Generated: {generated_text[0]["generated_text"]}\n')
    
    # Optionally, compute some evaluation metrics (e.g., BLEU, perplexity)
    # ...

if __name__ == "__main__":
    evaluate('<model_path>', '<eval_file>')

Replace placeholders like <data_file>, <embedding_file>, <training_data>, and <output_dir> with your actual file paths. Note: these scripts are simplified examples and may not work out of the box for your specific scenario.

Based on the gathered data, here is a detailed comparison table that incorporates the differences between a Normal Model, Directly Fine-tuned Model, Fine-tuned with Embeddings, and Not Fine-tuned but with Embeddings:

| Aspect | Normal Model | Directly Fine-tuned Model | Fine-tuned with Embeddings | Not Fine-tuned but with Embeddings |
|---|---|---|---|---|
| Definition | A pre-trained model used as-is. | Modifying a pre-trained model’s weights on relevant data. | Using pre-generated embeddings to adjust a model while also fine-tuning. | Using pre-generated embeddings without fine-tuning. |
| Data Requirement | None. | Requires original or relevant data. | Requires only embeddings, not original data. | Requires only embeddings, not original data. |
| Computational Cost | None. | Higher due to backpropagation through the entire model. | Lower as embeddings are pre-computed; some costs for fine-tuning. | Lower as embeddings are pre-computed. |
| Model Architecture | Same architecture as pre-trained model. | Same architecture as pre-trained model. | Can use a different architecture optimized for the task. | Can use a different architecture optimized for the task. |
| Task Adaptability | May lack domain-specific knowledge. | Model learns specifics of new task. | Model learns from embeddings; some task specifics through fine-tuning. | Model learns from embeddings, may not capture all task specifics. |
| Transferability | Dependent on the pre-trained model. | Better transfer to similar tasks due to direct learning. | Transfer might be less effective as finer nuances might be lost in embeddings. | Transfer might be less effective as finer nuances might be lost in embeddings. |
| Storage Requirements | Requires storing the original model. | Requires storing the entire fine-tuned model. | Requires storing embeddings and the fine-tuned model separately. | Requires storing embeddings separately. |
| Ease of Implementation | Easy if the model fits the task. | May require careful setup to avoid catastrophic forgetting. | Easier to set up, as embeddings can be used with various models; some setup for fine-tuning. | Easier to set up, as embeddings can be used with various models. |
| Training Time | None. | Longer due to backpropagation. | Shorter compared to direct fine-tuning. | None. |
| Integration & Accuracy | Base accuracy depends on original training. | Improved accuracy for domain-specific tasks due to fine-tuning on relevant data. | Can achieve high accuracy by leveraging both fine-tuning and embeddings; exact accuracy may vary. | Embeddings provide domain-specific insights; exact impact on accuracy may vary. |
| Learning New Information | Limited to original training data. | Learns new information through fine-tuning on new data. | Can learn new information through fine-tuning and embeddings. | Embeddings can provide new domain-specific information. |
| Question Answering (Q&A) Tasks | May lack domain-specific knowledge. | Improved accuracy in Q&A tasks due to fine-tuning. | Potential for improved accuracy in Q&A tasks through fine-tuning and embeddings; exact impact may vary. | Embeddings may improve accuracy in Q&A tasks by providing domain-specific information; exact impact may vary. |

Accuracy across these configurations varies with the specific task and data. Fine-tuning directly and using embeddings are both valid strategies; which one is more suitable depends on the requirements and constraints of your project.

Categories
AWS

AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity

Introduction

If you’re working with AWS and Kubernetes, you might encounter an error like this:

[event: ingress my-ingress] (combined from similar events): Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
status code: 403, request id: xxxxxx-yyyyy-zzzzz-aaaa-bbbbb

This blog post aims to guide you through the causes, debugging steps, and the ultimate fix for this issue.

Causes

  1. Incorrect IAM Role: The IAM role assumed by the AWS Load Balancer Controller may not have the correct permissions.
  2. Incorrect Trust Relationship: The trust relationship for the IAM role may not be configured correctly.
  3. AWS STS Access Denied: AWS Security Token Service (STS) is denying the AssumeRoleWithWebIdentity request.

Debugging Steps

Before diving into the fix, it’s essential to understand what’s wrong. Here are some commands to help you debug:

  1. Check IAM Role
    aws iam get-role --role-name <Role-Name>
  2. Check Trust Relationship (a reference trust policy is shown below)
    aws iam get-role --role-name <Role-Name> --query 'Role.AssumeRolePolicyDocument'
  3. Check STS Assume Role
    aws sts assume-role-with-web-identity --role-arn <Role-ARN> --role-session-name <Session-Name> --web-identity-token <Token>

The Fix: Resetting AWS Load Balancer Controller

The fix involves resetting the AWS Load Balancer Controller. Here’s a Makefile target named alb-reset that automates this process:

alb-reset:
    # Uninstall the existing Helm release
    $(HELM_EXEC) uninstall aws-load-balancer-controller --namespace kube-system

    # Re-add the Helm repo and update
    $(HELM_EXEC) repo add eks https://aws.github.io/eks-charts
    $(HELM_EXEC) repo update

    # Reinstall the Helm chart
    $(HELM_EXEC) install aws-load-balancer-controller eks/aws-load-balancer-controller \
        --namespace kube-system \
        --set clusterName=$(CLUSTER_NAME) \
        --set serviceAccount.create=true \
        --set serviceAccount.name=aws-load-balancer-controller \
        --create-namespace

    # Check Helm release status
    $(HELM_EXEC) status aws-load-balancer-controller --namespace kube-system

    # Check if AWS Load Balancer Controller is running
    kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
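
Assuming HELM_EXEC and CLUSTER_NAME are defined in (or passed to) your Makefile, running the target is just:

make alb-reset HELM_EXEC=helm CLUSTER_NAME=my-cluster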

Conclusion

Errors related to IAM roles and permissions can be tricky to debug. However, with the right steps and a handy Makefile target, you can resolve the issue and get your AWS Load Balancer Controller back up and running.

Categories
Machine Learning

End-to-End Metaflow ML Flow

from metaflow import FlowSpec, step, card
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from datetime import datetime

class MLFlow(FlowSpec):

    @card
    @step
    def start(self):
        """
        Load dataset and split it.
        """
        self.timestamp = datetime.utcnow()
        iris = load_iris()
        X, y = iris.data, iris.target
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=0.2)
        self.next(self.train_model)

    @card
    @step
    def train_model(self):
        """
        Train a RandomForest model.
        """
        self.timestamp = datetime.utcnow()
        self.model = RandomForestClassifier()
        self.model.fit(self.X_train, self.y_train)
        self.next(self.evaluate_model)

    @card
    @step
    def evaluate_model(self):
        """
        Evaluate the model.
        """
        self.timestamp = datetime.utcnow()
        self.score = self.model.score(self.X_test, self.y_test)
        self.next(self.end)

    @step
    def end(self):
        """
        End the flow.
        """
        print("Flow is complete.")

if __name__ == '__main__':
    MLFlow()
python <your_script_name>.py run

Scale, Allocate Resources, and Save the Model

from metaflow import FlowSpec, step, card, resources, Parameter, S3
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from datetime import datetime
import pickle

class MLFlow(FlowSpec):

    # Parameter for scaling
    scale = Parameter('scale', default=True)

    @card
    @step
    def start(self):
        self.timestamp = datetime.utcnow()
        iris = load_iris()
        X, y = iris.data, iris.target
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=0.2)
        self.next(self.preprocess)

    @card
    @step
    def preprocess(self):
        self.timestamp = datetime.utcnow()
        if self.scale:
            scaler = StandardScaler()
            self.X_train = scaler.fit_transform(self.X_train)
            self.X_test = scaler.transform(self.X_test)
        self.next(self.train_model)

    @resources(memory=4000, cpu=2)
    @card
    @step
    def train_model(self):
        self.timestamp = datetime.utcnow()
        self.model = RandomForestClassifier()
        self.model.fit(self.X_train, self.y_train)
        self.next(self.evaluate_model)

    @card
    @step
    def evaluate_model(self):
        self.timestamp = datetime.utcnow()
        self.score = self.model.score(self.X_test, self.y_test)
        self.next(self.save_model)

    @resources(memory=4000, cpu=2)
    @card
    @step
    def save_model(self):
        self.timestamp = datetime.utcnow()
        with S3(run=self) as s3:
            # Store the pickled model as an artifact under this run's S3 root
            s3.put("my-model.pkl", pickle.dumps(self.model))
        self.next(self.end)

    @step
    def end(self):
        print("Flow is complete.")

if __name__ == '__main__':
    MLFlow()
python <your_script_name>.py run --scale=True
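
After a run finishes, you can pull its artifacts back out with the Metaflow client API; a minimal sketch, assuming the flow above and your default datastore settings:

from metaflow import Flow

# Inspect the most recent successful run of the flow
run = Flow('MLFlow').latest_successful_run
print("accuracy:", run['evaluate_model'].task.data.score)
model = run['train_model'].task.data.model  # the trained RandomForestClassifier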

requirements.txt

metaflow
scikit-learn
boto3

Dashboard

python3 <your_script_name>.py card view start
Categories
Kubernetes

Migrating GCP GKE to AWS EKS

Create Google Cloud Build Trigger (GCP)

  1. First, make sure you’ve installed and initialized the gcloud CLI tool.
  2. Create a build trigger:
gcloud builds triggers create github \
  --repo-name=<YOUR_GITHUB_REPO_NAME> \
  --repo-owner=<YOUR_GITHUB_USERNAME> \
  --branch-pattern="^master$" \
  --build-config=<PATH_TO_YOUR_CLOUDBUILD.yaml>

Replace placeholders like <YOUR_GITHUB_REPO_NAME>, <YOUR_GITHUB_USERNAME>, and <PATH_TO_YOUR_CLOUDBUILD.yaml> with your specific values.
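
If you do not already have a cloudbuild.yaml, a minimal one that builds and pushes a container image might look like the following (the image name is illustrative):

steps:
  # Build the container image from the repository's Dockerfile
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA', '.']
# Push the built image to Container Registry
images:
  - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'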

Create AWS CodeBuild Project

  1. Install and configure the aws CLI tool.
  2. Create a CodeBuild project using a JSON build specification:
aws codebuild create-project \
    --name <PROJECT_NAME> \
    --source "type=GITHUB,location=https://github.com/<GITHUB_USERNAME>/<GITHUB_REPO>.git,buildspec=<PATH_TO_YOUR_buildspec.yml>" \
    --artifacts "type=S3,location=<S3_BUCKET_NAME>" \
    --environment "type=LINUX_CONTAINER,computeType=BUILD_GENERAL1_SMALL,image=aws/codebuild/standard:5.0" \
    --service-role <IAM_ROLE_ARN>

Replace placeholders like <PROJECT_NAME>, <GITHUB_USERNAME>, <GITHUB_REPO>, <S3_BUCKET_NAME>, <IAM_ROLE_ARN>, and <PATH_TO_YOUR_buildspec.yml> with your specific values.

Run these commands in your terminal to create the respective build setups.
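
If you need a starting point for the buildspec as well, a minimal buildspec.yml that builds and pushes an image to ECR might look like this (account, region, and repository values are placeholders):

version: 0.2
phases:
  pre_build:
    commands:
      # Authenticate Docker against your ECR registry
      - aws ecr get-login-password --region <REGION> | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com
  build:
    commands:
      - docker build -t <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:latest .
  post_build:
    commands:
      - docker push <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:latest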


Dockerfile for GCP (Utilizes Google Storage)

# Using base image
FROM python:3.8

# Set environment variables for GCP
ENV GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json

# Install gsutil (note: the google-cloud-sdk package requires Google's Cloud SDK apt repository to be configured first)
RUN apt-get update && apt-get install -y google-cloud-sdk

# Copy files and use gsutil
COPY . .
RUN gsutil cp gs://<your-gcp-bucket>/some-file /some-directory/

Dockerfile for AWS (Utilizes S3)

# Using base image
FROM python:3.8

# Set environment variables for AWS (for real builds, prefer build args or an IAM role over baking credentials into the image)
ENV AWS_ACCESS_KEY_ID=<your-access-key>
ENV AWS_SECRET_ACCESS_KEY=<your-secret-key>

# Install AWS CLI
RUN apt-get update && apt-get install -y awscli

# Copy files and use AWS S3 commands
COPY . .
RUN aws s3 cp s3://<your-aws-bucket>/some-file /some-directory/

Helm Chart Deployment Example for GCP with GCR Image

Here’s a short Helm chart for deploying a Kubernetes Deployment that uses an image from Google Container Registry (GCR).

# gcp-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gcp-app
  template:
    metadata:
      labels:
        app: gcp-app
    spec:
      containers:
      - name: gcp-container
        image: gcr.io/<PROJECT_ID>/<IMAGE_NAME>:<TAG>

Helm Chart Deployment Example for AWS with ECR Image

Here’s a short Helm chart for deploying a Kubernetes Deployment that uses an image from Amazon Elastic Container Registry (ECR).

# aws-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aws-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aws-app
  template:
    metadata:
      labels:
        app: aws-app
    spec:
      containers:
      - name: aws-container
        image: <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:<TAG>

Replace placeholders like <PROJECT_ID>, <IMAGE_NAME>, <TAG>, <ACCOUNT_ID>, <REGION>, and <REPO_NAME> with your specific values.

To deploy these, you’d typically run helm install or helm upgrade with these files as input.
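
For example, if these manifests live under a chart's templates/ directory, the deployment might look like the following (chart path and release names are illustrative); plain kubectl apply also works for the standalone files shown above:

# Apply a standalone manifest directly
kubectl apply -f aws-deployment.yaml

# Or install/upgrade a chart that contains these templates
helm upgrade --install my-app ./my-chart --namespace default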


Helm Chart for Ingress in GCP

Manual SSL

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "your-static-ip"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
  - secretName: ssl-cert
  defaultBackend:
    service:
      name: my-service
      port:
        number: 80

Auto Managed SSL

apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: managed-cert
spec:
  domains:
    - example.com
    - dev.example.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "your-static-ip"
    kubernetes.io/ingress.class: "gce"
    networking.gke.io/managed-certificates: managed-cert
spec:
  defaultBackend:
    service:
      name: my-service
      port:
        number: 80

Helm Chart for Ingress in AWS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: "your-aws-certificate-arn"
spec:
  tls:
  - hosts:
    - "your.domain.com"
    secretName: aws-cert
  rules:
    ...

Helm Chart with Standard Storage in GCP

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Code for Creating and Assigning Storage in AWS (PV and PVC)

aws ec2 create-volume \
  --availability-zone us-west-2a \
  --size 20 \
  --volume-type gp2

# Assign the returned volume id to a variable
VOLUME_ID="vol-XXXXXXXXXXXXXXXXX"

# Substitute the actual volume id for ${VOLUME_ID} (e.g. via envsubst) before applying
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-aws-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    volumeID: ${VOLUME_ID}
    fsType: ext4

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-aws-pvc
spec:
  storageClassName: gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
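
To use either claim, a workload mounts it by name. A minimal sketch for the AWS claim above (pod name, container image, and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-aws-pvc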

Terraform code to assign GCP Global Load Balancer IP address to A Record

provider "google" {
  credentials = file("<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON>")
  project     = "<PROJECT_ID>"
  region      = "us-central1"
}

resource "google_dns_managed_zone" "my_zone" {
  name     = "my-zone"
  dns_name = "example.com."
}

resource "google_dns_record_set" "my_a_record" {
  name         = "sub.example.com."
  type         = "A"
  ttl          = 300
  managed_zone = google_dns_managed_zone.my_zone.name
  rrdatas      = [google_compute_global_address.my_global_address.address]
}

resource "google_compute_global_address" "my_global_address" {
  name = "my-global-address"
}

Replace <PATH_TO_YOUR_SERVICE_ACCOUNT_JSON> with the path to your Google Cloud service account JSON file, and <PROJECT_ID> with your Google Cloud Project ID.

Terraform code to assign AWS Route53 to Load Balancer (A Record)

provider "aws" {
  region = "us-west-2"
}

resource "aws_route53_zone" "my_zone" {
  name = "example.com."
}

resource "aws_route53_record" "my_a_record" {
  zone_id = aws_route53_zone.my_zone.zone_id
  name    = "sub.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.my_lb.dns_name
    zone_id                = aws_lb.my_lb.zone_id
    evaluate_target_health = false
  }
}

resource "aws_lb" "my_lb" {
  name               = "my-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.my_group.id]  # assumes this security group is defined elsewhere
  subnets            = [aws_subnet.my_subnet.id]         # ALBs require at least two subnets in different AZs in practice
  enable_deletion_protection = false
}

AWS Terraform Code for ALB and EKS with Subnets

provider "aws" {
  region = "us-west-2"
}

# Define VPC
resource "aws_vpc" "my_vpc" {
  cidr_block = "10.0.0.0/16"
}

# Create two public subnets
resource "aws_subnet" "my_subnet1" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.1.0/24"
  map_public_ip_on_launch = true  # Public Subnet
}

resource "aws_subnet" "my_subnet2" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.2.0/24"
  map_public_ip_on_launch = true  # Public Subnet
}

# ALB tied to public subnets
resource "aws_lb" "my_lb" {
  name               = "my-lb"
  internal           = false
  load_balancer_type = "application"
  subnets            = [aws_subnet.my_subnet1.id, aws_subnet.my_subnet2.id]
  enable_deletion_protection = false
}

# EKS Cluster
resource "aws_eks_cluster" "this" {
  name     = "my-cluster"
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    subnet_ids = [aws_subnet.my_subnet1.id, aws_subnet.my_subnet2.id]
  }
}

resource "aws_iam_role" "eks_cluster" {
  name = "eks-cluster"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Action = "sts:AssumeRole",
        Effect = "Allow",
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })
}
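
One thing the snippet above leaves out: EKS needs the AmazonEKSClusterPolicy managed policy attached to the cluster role before the cluster can be created. A typical attachment looks like this:

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  role       = aws_iam_role.eks_cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}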

GCP Terraform Code for Global Load Balancer and GKE with Subnets

provider "google" {
  credentials = file("<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON>")
  project     = "<PROJECT_ID>"
  region      = "us-central1"
}

# Create VPC
resource "google_compute_network" "my_network" {
  name = "my-network"
}

# Create public subnet
resource "google_compute_subnetwork" "my_public_subnet" {
  name          = "my-public-subnet"
  ip_cidr_range = "10.0.1.0/24"
  network       = google_compute_network.my_network.self_link
  region        = "us-central1"  # Public Subnet
}

# Create Global HTTP Load Balancer components
resource "google_compute_global_forwarding_rule" "global_forwarding_rule" {
  name       = "my-global-lb-forwarding-rule"
  target     = google_compute_target_http_proxy.default.self_link
  port_range = "80"
  ip_address = google_compute_global_address.default.address
}

resource "google_compute_global_address" "default" {
  name = "my-global-ip"
}

resource "google_compute_target_http_proxy" "default" {
  name  = "my-http-proxy"
  url_map = google_compute_url_map.default.self_link
}

resource "google_compute_url_map" "default" {
  name        = "my-url-map"
  default_service = google_compute_backend_service.default.self_link
}

resource "google_compute_backend_service" "default" {
  name        = "my-backend-service"
  port_name   = "http"
  protocol    = "HTTP"
  timeout_sec = 10

  backend {
    group = google_compute_instance_group_manager.default.instance_group
  }
}

resource "google_compute_instance_group_manager" "default" {
  name = "my-instance-group"
  base_instance_name = "my-instance"
  instance_template = google_compute_instance_template.default.self_link
  zone = "us-central1-a"
}

resource "google_compute_instance_template" "default" {
  name_prefix  = "my-instance-template-"
  machine_type = "g1-small"
  tags         = ["http-server"]

  network_interface {
    network = google_compute_network.my_network.name
  }
}

# GKE Cluster
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"
  network    = google_compute_network.my_network.name
  subnetwork = google_compute_subnetwork.my_public_subnet.name  # Public Subnet

  initial_node_count = 3
}

Public and Private Subnets in Load Balancers

  • AWS: You can associate both public and private subnets with an ALB. However, the ALB itself is either internal or internet-facing, not both. You would typically use an internet-facing ALB with public subnets and an internal ALB with private subnets.
  • GCP: Google Cloud Load Balancers operate differently than AWS ALBs. The Global Load Balancer doesn’t directly associate with subnets; rather, it directs traffic to instance groups, which can be in different regions and subnets. You can have backend services in both public and private subnets.
Categories
Automation

Jumpcutter and Auto-Editor Commands

Jumpcutter

# Clone repository
git clone https://github.com/carykh/jumpcutter.git

# Move into directory
cd jumpcutter

# Install Requirements
pip install -r requirements.txt

# Basic Usage
python jumpcutter.py --input_file <input_video_file>

# Specify silence speed
python jumpcutter.py --input_file <input_video_file> --silent_speed 1

# Specify sounded speed
python jumpcutter.py --input_file <input_video_file> --sounded_speed 1.2

# Debug (No specific debug flag)
python jumpcutter.py --input_file <input_video_file> --frame_margin 1

To find the value for <input_video_file>:

ls *.mp4  # Lists all mp4 files in the current directory

Auto-Editor

# Install auto-editor
pip install auto-editor

# Basic Usage
auto-editor <input_video_file>

# Specify output format
auto-editor <input_video_file> --format mp4

# Change speed factor
auto-editor <input_video_file> --fast_speed 1.5

# Specify audio language
auto-editor <input_video_file> --audio_language eng

# Debug
auto-editor <input_video_file> --debug

To find the value for <input_video_file>:

ls *.mp4  # Lists all mp4 files in the current directory

Commands for Debugging

# For auto-editor
auto-editor <input_video_file> --debug

# For jumpcutter (No specific debug flag)
python jumpcutter.py --input_file <input_video_file> --frame_margin 1

Replace <input_video_file> with your video file name.

Categories
Kubernetes

Adding MySQL Dashboards to Grafana using Helm

mysql-exporter-values.yaml

mysql:
  db: ""
  host: "172.1.0.6"
  pass: "IjTdP4ri79"
  port: 3306
  protocol: ""
  user: "root"

serviceMonitor:
  # enabled should be set to true to enable prometheus-operator discovery of this service
  enabled: true
  additionalLabels:
    release: grafana

Install Kube-Prometheus-Stack on Cluster

  1. Add Prometheus Helm Repo
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  2. Update Helm Repo
    helm repo update
  3. Create Namespace ‘metrics’
    kubectl create ns metrics
  4. Set Current Context to ‘metrics’ Namespace
    kubectl config set-context --current --namespace=metrics
  5. Deploy Grafana & Kube-Prometheus-Stack
    helm upgrade --install grafana prometheus-community/kube-prometheus-stack

Install Prometheus MySQL Exporter

  1. Add Prometheus Helm Repo (if not added)
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  2. Update Helm Repo
    helm repo update
  3. Deploy MySQL Exporter
    helm upgrade --install mysql-exporter prometheus-community/prometheus-mysql-exporter -f mysql-exporter-values.yaml

Install Prometheus MySQL Exporter Dashboard

  1. Grafana Dashboard IDs for MySQL Monitoring
    • 14057
    • 7362

Combined Code Block

# Add Prometheus Helm Repo and Update
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create Namespace 'metrics' and Set Context
kubectl create ns metrics
kubectl config set-context --current --namespace=metrics

# Deploy Grafana & Kube-Prometheus-Stack
helm upgrade --install grafana prometheus-community/kube-prometheus-stack

# Deploy MySQL Exporter
helm upgrade --install mysql-exporter prometheus-community/prometheus-mysql-exporter -f mysql-exporter-values.yaml
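
To open Grafana and import the dashboards above (IDs 14057 and 7362), you can port-forward the service and fetch the generated admin password. The service and secret names below follow the kube-prometheus-stack convention for a release named grafana; adjust them if your release differs:

# Get the Grafana admin password
kubectl get secret grafana-grafana -n metrics -o jsonpath="{.data.admin-password}" | base64 -d; echo

# Port-forward Grafana to http://localhost:3000 (log in as admin)
kubectl port-forward svc/grafana-grafana -n metrics 3000:80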
Categories
Kubernetes

Enable HTTPS on AWS ALB with Kubernetes

Step-01: Register a Domain in Route53 (if not exists)

  1. AWS Management Console -> Services -> Route53 -> Registered Domains
  2. Click on “Register Domain”
  3. Enter domain: example.com -> Click “Check”
  4. “Add to cart” -> “Continue”
  5. Fill Contact Details -> “Continue”
  6. Enable “Automatic Renewal”
  7. Accept Terms -> “Complete Order”

Step-02: Create SSL Certificate in Certificate Manager

  1. Services -> Certificate Manager -> “Create a Certificate”
  2. “Request a Certificate” -> Choose “Request a public certificate”
  3. Add domain: *.example.com
  4. Select “DNS Validation” -> “Confirm & Request”
  5. “Create record in Route 53”
  6. Wait 5-10 mins -> Check Validation Status

Step-03: Update Ingress Manifest with SSL Annotations

07-ALB-Ingress-SSL.yml

# SSL Annotations
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
alb.ingress.kubernetes.io/certificate-arn: <YOUR_CERTIFICATE_ARN>

To get <YOUR_CERTIFICATE_ARN>, run: aws acm list-certificates --query 'CertificateSummaryList[*].CertificateArn' --output text
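
For reference, a minimal Ingress using these annotations might look like the following (service name and path are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-ssl-demo
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/certificate-arn: <YOUR_CERTIFICATE_ARN>
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80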

Step-04: Deploy Manifests and Test

  1. Deploy
kubectl apply -f kube-manifests/
  2. Verify
  • Load Balancer: 80 & 443
  • Target Groups: Health checks
  • kubectl get ingress

Step-05: Add DNS in Route53

  1. Services -> Route 53 -> Hosted Zones -> Click example.com
  2. “Create a Record Set”
  3. Name: ssldemo.example.com
  4. Alias: Yes
  5. Alias Target: <ALB_DNS_Name>
  6. “Create”

To get <ALB_DNS_Name>, run: kubectl get ingress -n <namespace> -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'

Step-06: Access Application

  • HTTP URLs: http://ssldemo.example.com/<your_app_endpoints>
  • HTTPS URLs: https://ssldemo.example.com/<your_app_endpoints>

Debug Commands (If needed)

aws route53 list-hosted-zones
aws acm list-certificates
kubectl get svc
kubectl get ingress

Combined One Block Code

aws acm list-certificates --query 'CertificateSummaryList[*].CertificateArn' --output text
kubectl apply -f kube-manifests/
kubectl get ingress -n <namespace> -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'
aws route53 list-hosted-zones
aws acm list-certificates
kubectl get svc
kubectl get ingress
Categories
Kubernetes

Install SSL in an EKS Cluster Using cert-manager

Steps to Install SSL in an EKS Cluster Using cert-manager and aws-pca-issuer

Prerequisites

  • AWS CLI, eksctl, kubectl, and Helm must be installed.

1. Create IAM Policy File

Create a file named pca-iam-policy.json and save the IAM policy inside it.

echo '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuer",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate",
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:acm-pca:<region>:<account_id>:certificate-authority/<resource_id>"
    }
  ]
}' > pca-iam-policy.json

2. Create IAM Policy

Run the following AWS CLI command.

aws iam create-policy \
    --policy-name AWSPCAIssuerIAMPolicy \
    --policy-document file://pca-iam-policy.json

Note down the returned policy ARN.

3. Create IAM Role and ServiceAccount

Replace <AWS_ACCOUNT_ID> with your AWS account ID.

eksctl create iamserviceaccount \
--cluster=nlb-lab \
--namespace=aws-pca-issuer \
--name=aws-pca-issuer \
--attach-policy-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:policy/AWSPCAIssuerIAMPolicy \
--override-existing-serviceaccounts \
--approve

4. Install cert-manager and aws-pca-issuer

helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
helm install aws-pca-issuer awspca/aws-privateca-issuer -n aws-pca-issuer --set serviceAccount.create=false --set serviceAccount.name=aws-pca-issuer
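
Note that the aws-privateca-issuer chart assumes cert-manager is already running in the cluster. If it is not installed yet, a typical installation via the Jetstack chart looks like this:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true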

5. Verify Installation

kubectl get pods --namespace aws-pca-issuer

6. Create ClusterIssuer

Replace placeholders and create cluster-issuer.yaml.

apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAClusterIssuer
metadata:
  name: demo-test-root-ca
spec:
  arn: arn:aws:acm-pca:<region>:<account-id>:certificate-authority/<resource_id>
  region: <region>

kubectl apply -f cluster-issuer.yaml

7. Create Certificate

Replace domain name and create nlb-lab-tls.yaml.

kind: Certificate
apiVersion: cert-manager.io/v1
metadata:
  name: nlb-lab-tls-cert
spec:
  commonName: www.nlb-lab.com
  dnsNames:
    - www.nlb-lab.com
    - nlb-lab.com
  duration: 2160h0m0s
  issuerRef:
    group: awspca.cert-manager.io
    kind: AWSPCAClusterIssuer
    name: demo-test-root-ca
  renewBefore: 360h0m0s
  secretName: nlb-tls-app-secret
  usages:
    - server auth
    - client auth
  privateKey:
    algorithm: "RSA"
    size: 2048

kubectl apply -f nlb-lab-tls.yaml

8. Verify Certificate

kubectl get certificate
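
Once the certificate is Ready, the issued key pair lives in the nlb-tls-app-secret Secret, which you can reference from a TLS-terminating resource. A minimal sketch for an Ingress (service name and paths are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nlb-lab-ingress
spec:
  tls:
    - hosts:
        - www.nlb-lab.com
      secretName: nlb-tls-app-secret
  rules:
    - host: www.nlb-lab.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nlb-lab-service
                port:
                  number: 80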

Combined Code Block

#!/bin/bash

# Define variables
REGION="<region>"
ACCOUNT_ID="<account_id>"
RESOURCE_ID="<resource_id>"

# Create IAM policy
cat <<EOF > pca-iam-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuer",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate",
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:acm-pca:$REGION:$ACCOUNT_ID:certificate-authority/$RESOURCE_ID"
    }
  ]
}
EOF

# Create IAM Policy
aws iam create-policy \
    --policy-name AWSPCAIssuerIAMPolicy \
    --policy-document file://pca-iam-policy.json

# Create IAM Role and ServiceAccount
eksctl create iamserviceaccount \
--cluster=nlb-lab \
--namespace=aws-pca-issuer \
--name=aws-pca-issuer \
--attach-policy-arn=arn:aws:iam::$ACCOUNT_ID:policy/AWSPCAIssuerIAMPolicy \
--override-existing-serviceaccounts \
--approve

# Install cert-manager and aws-pca-issuer
helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
helm install aws-pca-issuer awspca/aws-privateca-issuer -n aws-pca-issuer --set serviceAccount.create=false --set serviceAccount.name=aws-pca-issuer

# Verify Installation
kubectl get pods --namespace aws-pca-issuer

# Create ClusterIssuer
kubectl apply -f cluster-issuer.yaml

# Create Certificate
kubectl apply -f nlb-lab-tls.yaml

# Verify Certificate
kubectl get certificate

Replace placeholders like <region>, <account_id>, and <resource_id> with actual values. Use AWS CLI commands or refer to your AWS console to find these.

Categories
Apache

No code signing authority for module

Problem Summary:

You can’t load the PHP module in Apache on macOS because it’s not signed.

Steps to Fix:

1. Install codesign Tool:

xcode-select --install

2. Create Certificate Authority & Code Signing Certificate

  • Open “Keychain Access” (CMD+Space -> type “Keychain Access”).
  • Follow instructions in these articles to create certificates:
  • Article 1
  • Article 2
  • Article 3

3. Sign PHP Module

Replace <common_name> with your common name.

common_name="<common_name>"
codesign --sign "$common_name" --force --keychain ~/Library/Keychains/login.keychain-db /usr/local/opt/php/lib/httpd/modules/libphp.so

4. Verify Signing

codesign -dv --verbose=4 "/usr/local/opt/php/lib/httpd/modules/libphp.so"

5. Update Apache Config

Replace <authority> with the Authority from step 4.

authority="<authority>"
sed -i '' "s|LoadModule php_module.*|LoadModule php_module /usr/local/opt/php/lib/httpd/modules/libphp.so \"$authority\"|" /etc/apache2/httpd.conf

6. Restart Apache Server

apachectl restart

7. Debugging

Check Apache config and logs.

apachectl configtest
cat /var/log/apache2/error_log

Combined Code Block:

# Step 1
xcode-select --install
# Step 3
common_name="<common_name>"
codesign --sign "$common_name" --force --keychain ~/Library/Keychains/login.keychain-db /usr/local/opt/php/lib/httpd/modules/libphp.so
# Step 4
codesign -dv --verbose=4 "/usr/local/opt/php/lib/httpd/modules/libphp.so"
# Step 5
authority="<authority>"
sed -i '' "s|LoadModule php_module.*|LoadModule php_module /usr/local/opt/php/lib/httpd/modules/libphp.so \"$authority\"|" /etc/apache2/httpd.conf
# Step 6
apachectl restart
# Step 7
apachectl configtest
cat /var/log/apache2/error_log