Categories
AI

Custom Instructions Prompt

Act as Professor Synapse ❤️, a conductor of expert agents. Your job is to support me in accomplishing my goals by finding alignment with me, then calling upon an expert agent perfectly suited to the task by initialising:

Synapse_CoR = “[emoji]: I am an expert in [role&domain]. I know [context]. I will reason step-by-step to determine the best course of action to achieve [goal]. I can use [tools] and [relevant frameworks] to help in this process.

I will help you accomplish your goal by following these steps:
[reasoned steps]

My task ends when [completion].

[first step, question]”

Instructions:
1. ❤️ gather context, relevant information and clarify my goals by asking questions
2. Once confirmed, initialize Synapse_CoR
3. ❤️ and ${emoji} support me until goal is complete

Commands:
/start=❤️,introduce and begin with step one
/ts=❤️,summon (Synapse_CoR*3) town square debate
/save=❤️, restate goal, summarize progress, reason next step

Personality:
-curious, inquisitive, encouraging
-use emojis to express yourself

Rules:
-End every output with a question or reasoned next step
-Start every output with ❤️: or ${emoji}: to indicate who is speaking.
-Organize every output with ❤️ aligning on my request, followed by ${emoji} response
-❤️, recommend save after each task is completed

Categories
AI

Embedding Impact Across Model Configurations

Fine-tuning a model like ChatGPT with embeddings involves a few steps. Here’s a simplified outline of the process:

  1. Embedding Generation:
  • Use an embedding model (e.g., BERT, Word2Vec) to generate embeddings for your data.
python generate_embeddings.py --data_file <data_file> --embedding_file <embedding_file>
  2. Prepare Data:
  • Combine embeddings with your data in a format suitable for training.
python prepare_data.py --embedding_file <embedding_file> --output_file <training_data>
  3. Fine-tuning:
  • Use the prepared data to fine-tune ChatGPT.
python run_finetuning.py --model_name_or_path gpt-3.5-turbo --train_file <training_data> --output_dir <output_dir>
  4. Evaluation:
  • Evaluate the fine-tuned model on a separate dataset to check the performance.
python evaluate.py --model_name_or_path <output_dir> --eval_file <eval_data>

Example

Below are simplified example scripts and data file content to give you an idea of how this process might be structured.

  1. Example Content of data_file:
question: What is the capital of France? | answer: Paris
question: What is the capital of Germany? | answer: Berlin
...
  2. generate_embeddings.py:
from transformers import BertTokenizer, BertModel
import torch

def generate_embeddings(data_file, embedding_file):
    # Load a pre-trained BERT tokenizer and model
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')

    with open(data_file, 'r') as f:
        data = f.readlines()

    embeddings = []
    for line in data:
        # Tokenize each line and run it through BERT without tracking gradients
        inputs = tokenizer(line, return_tensors='pt', truncation=True)
        with torch.no_grad():
            outputs = model(**inputs)
        # Mean-pool the final hidden states into a single vector per line
        embeddings.append(outputs.last_hidden_state.mean(dim=1).squeeze().tolist())

    # Write one comma-separated embedding per line
    with open(embedding_file, 'w') as f:
        for embedding in embeddings:
            f.write(','.join(map(str, embedding)) + '\n')

if __name__ == "__main__":
    generate_embeddings('<data_file>', '<embedding_file>')
  3. prepare_data.py:
def prepare_data(embedding_file, training_data):
    # Simplified: copies the pre-computed embeddings straight into the training file.
    # A real pipeline would pair each embedding with its source question/answer text.
    with open(embedding_file, 'r') as ef, open(training_data, 'w') as tf:
        for line in ef:
            embedding = line.strip()
            tf.write(embedding + '\n')

if __name__ == "__main__":
    prepare_data('<embedding_file>', '<training_data>')
  4. run_finetuning.py:
from transformers import GPT2Tokenizer, GPT2LMHeadModel, TextDataset, DataCollatorForLanguageModeling, Trainer, TrainingArguments

def fine_tune(training_data, output_dir):
    # 'gpt2' is the Hugging Face model id for GPT-2
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')

    train_dataset = TextDataset(
        tokenizer=tokenizer,
        file_path=training_data,
        block_size=128,
    )

    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer,
        mlm=False,
    )

    training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=True,
        num_train_epochs=1,
        per_device_train_batch_size=32,
        save_steps=10_000,
        save_total_limit=2,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
    )

    trainer.train()

if __name__ == "__main__":
    fine_tune('<training_data>', '<output_dir>')
  5. evaluate.py:
from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline
import torch

def evaluate(model_path, eval_file):
    # Load the fine-tuned model and tokenizer
    model = GPT2LMHeadModel.from_pretrained(model_path)
    tokenizer = GPT2Tokenizer.from_pretrained(model_path)
    
    # Create a text generation pipeline
    text_generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
    
    # Read the evaluation data
    with open(eval_file, 'r') as f:
        eval_data = f.readlines()
    
    # Iterate through the evaluation data and generate responses
    for i, prompt in enumerate(eval_data):
        generated_text = text_generator(prompt, max_length=150, do_sample=True, temperature=0.7)
        print(f'{i+1}. Prompt: {prompt.strip()}\n   Generated: {generated_text[0]["generated_text"]}\n')
    
    # Optionally, compute some evaluation metrics (e.g., BLEU, perplexity)
    # ...

if __name__ == "__main__":
    evaluate('<model_path>', '<eval_file>')

Replace placeholders like <data_file>, <embedding_file>, <training_data>, and <output_dir> with your actual file paths. Note: these scripts are simplified examples and may not work out of the box for your specific scenario.

Based on the gathered data, here is a detailed comparison table that incorporates the differences between a Normal Model, Directly Fine-tuned Model, Fine-tuned with Embeddings, and Not Fine-tuned but with Embeddings:

| Aspect | Normal Model | Directly Fine-tuned Model | Fine-tuned with Embeddings | Not Fine-tuned but with Embeddings |
|---|---|---|---|---|
| Definition | A pre-trained model used as-is. | Modifying a pre-trained model’s weights on relevant data. | Using pre-generated embeddings to adjust a model while also fine-tuning. | Using pre-generated embeddings without fine-tuning. |
| Data Requirement | None. | Requires original or relevant data. | Requires only embeddings, not original data. | Requires only embeddings, not original data. |
| Computational Cost | None. | Higher due to backpropagation through the entire model. | Lower as embeddings are pre-computed; some costs for fine-tuning. | Lower as embeddings are pre-computed. |
| Model Architecture | Same architecture as pre-trained model. | Same architecture as pre-trained model. | Can use a different architecture optimized for the task. | Can use a different architecture optimized for the task. |
| Task Adaptability | May lack domain-specific knowledge. | Model learns specifics of new task. | Model learns from embeddings; some task specifics through fine-tuning. | Model learns from embeddings, may not capture all task specifics. |
| Transferability | Dependent on the pre-trained model. | Better transfer to similar tasks due to direct learning. | Transfer might be less effective as finer nuances might be lost in embeddings. | Transfer might be less effective as finer nuances might be lost in embeddings. |
| Storage Requirements | Requires storing the original model. | Requires storing the entire fine-tuned model. | Requires storing embeddings and the fine-tuned model separately. | Requires storing embeddings separately. |
| Ease of Implementation | Easy if the model fits the task. | May require careful setup to avoid catastrophic forgetting. | Easier to set up, as embeddings can be used with various models; some setup for fine-tuning. | Easier to set up, as embeddings can be used with various models. |
| Training Time | None. | Longer due to backpropagation. | Shorter compared to direct fine-tuning. | None. |
| Integration & Accuracy | Base accuracy depends on original training. | Improved accuracy for domain-specific tasks due to fine-tuning on relevant data. | Can achieve high accuracy by leveraging both fine-tuning and embeddings; exact accuracy may vary. | Embeddings provide domain-specific insights; exact impact on accuracy may vary. |
| Learning New Information | Limited to original training data. | Learns new information through fine-tuning on new data. | Can learn new information through fine-tuning and embeddings. | Embeddings can provide new domain-specific information. |
| Question Answering (Q&A) Tasks | May lack domain-specific knowledge. | Improved accuracy in Q&A tasks due to fine-tuning. | Potential for improved accuracy in Q&A tasks through fine-tuning and embeddings; exact impact may vary. | Embeddings may improve accuracy in Q&A tasks by providing domain-specific information; exact impact may vary. |

Accuracy across these configurations varies with the specific task and data. Fine-tuning directly and using embeddings are both valid strategies; which one is more suitable depends on the requirements and constraints of your project.

Categories
AWS

AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity

Introduction

If you’re working with AWS and Kubernetes, you might encounter an error like this:

[event: ingress my-ingress] (combined from similar events): Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
status code: 403, request id: xxxxxx-yyyyy-zzzzz-aaaa-bbbbb

This blog post aims to guide you through the causes, debugging steps, and the ultimate fix for this issue.

Causes

  1. Incorrect IAM Role: The IAM role assumed by the AWS Load Balancer Controller may not have the correct permissions.
  2. Incorrect Trust Relationship: The trust relationship for the IAM role may not be configured correctly.
  3. AWS STS Access Denied: AWS Security Token Service (STS) is denying the AssumeRoleWithWebIdentity request.

Debugging Steps

Before diving into the fix, it’s essential to understand what’s wrong. Here are some commands to help you debug:

  1. Check IAM Role
    aws iam get-role --role-name <Role-Name>
  2. Check Trust Relationship (a reference trust policy is shown below)
    aws iam get-role --role-name <Role-Name> --query 'Role.AssumeRolePolicyDocument'
  3. Check STS Assume Role
    aws sts assume-role-with-web-identity --role-arn <Role-ARN> --role-session-name <Session-Name> --web-identity-token <Token>

The Fix: Resetting AWS Load Balancer Controller

The fix involves resetting the AWS Load Balancer Controller. Here’s a Makefile target named alb-reset that automates this process:

alb-reset:
    # Uninstall the existing Helm release
    $(HELM_EXEC) uninstall aws-load-balancer-controller --namespace kube-system

    # Re-add the Helm repo and update
    $(HELM_EXEC) repo add eks https://aws.github.io/eks-charts
    $(HELM_EXEC) repo update

    # Reinstall the Helm chart
    $(HELM_EXEC) install aws-load-balancer-controller eks/aws-load-balancer-controller \
        --namespace kube-system \
        --set clusterName=$(CLUSTER_NAME) \
        --set serviceAccount.create=true \
        --set serviceAccount.name=aws-load-balancer-controller \
        --create-namespace

    # Check Helm release status
    $(HELM_EXEC) status aws-load-balancer-controller --namespace kube-system

    # Check if AWS Load Balancer Controller is running
    kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
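
Assuming HELM_EXEC and CLUSTER_NAME are defined in (or passed to) your Makefile, running the target is just:

make alb-reset HELM_EXEC=helm CLUSTER_NAME=my-cluster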

Conclusion

Errors related to IAM roles and permissions can be tricky to debug. However, with the right steps and a handy Makefile target, you can resolve the issue and get your AWS Load Balancer Controller back up and running.

Categories
Machine Learning

End-to-End Metaflow ML Flow

from metaflow import FlowSpec, step, card
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from datetime import datetime

class MLFlow(FlowSpec):

    @card
    @step
    def start(self):
        """
        Load dataset and split it.
        """
        self.timestamp = datetime.utcnow()
        iris = load_iris()
        X, y = iris.data, iris.target
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=0.2)
        self.next(self.train_model)

    @card
    @step
    def train_model(self):
        """
        Train a RandomForest model.
        """
        self.timestamp = datetime.utcnow()
        self.model = RandomForestClassifier()
        self.model.fit(self.X_train, self.y_train)
        self.next(self.evaluate_model)

    @card
    @step
    def evaluate_model(self):
        """
        Evaluate the model.
        """
        self.timestamp = datetime.utcnow()
        self.score = self.model.score(self.X_test, self.y_test)
        self.next(self.end)

    @step
    def end(self):
        """
        End the flow.
        """
        print("Flow is complete.")

if __name__ == '__main__':
    MLFlow()
python <your_script_name>.py run

Scale, Allocate Resources, and Save the Model

from metaflow import FlowSpec, step, card, resources, Parameter, S3
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from datetime import datetime
import pickle

class MLFlow(FlowSpec):

    # Parameter for scaling
    scale = Parameter('scale', default=True)

    @card
    @step
    def start(self):
        self.timestamp = datetime.utcnow()
        iris = load_iris()
        X, y = iris.data, iris.target
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=0.2)
        self.next(self.preprocess)

    @card
    @step
    def preprocess(self):
        self.timestamp = datetime.utcnow()
        if self.scale:
            scaler = StandardScaler()
            self.X_train = scaler.fit_transform(self.X_train)
            self.X_test = scaler.transform(self.X_test)
        self.next(self.train_model)

    @resources(memory=4000, cpu=2)
    @card
    @step
    def train_model(self):
        self.timestamp = datetime.utcnow()
        self.model = RandomForestClassifier()
        self.model.fit(self.X_train, self.y_train)
        self.next(self.evaluate_model)

    @card
    @step
    def evaluate_model(self):
        self.timestamp = datetime.utcnow()
        self.score = self.model.score(self.X_test, self.y_test)
        self.next(self.save_model)

    @resources(memory=4000, cpu=2)
    @card
    @step
    def save_model(self):
        self.timestamp = datetime.utcnow()
        with S3(run=self) as s3:
            # Store the pickled model as an artifact under this run's S3 root
            s3.put("my-model.pkl", pickle.dumps(self.model))
        self.next(self.end)

    @step
    def end(self):
        print("Flow is complete.")

if __name__ == '__main__':
    MLFlow()
python <your_script_name>.py run --scale=True
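
After a run finishes, you can pull its artifacts back out with the Metaflow client API; a minimal sketch, assuming the flow above and your default datastore settings:

from metaflow import Flow

# Inspect the most recent successful run of the flow
run = Flow('MLFlow').latest_successful_run
print("accuracy:", run['evaluate_model'].task.data.score)
model = run['train_model'].task.data.model  # the trained RandomForestClassifier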

requirements.txt

metaflow
scikit-learn
boto3

Dashboard

python3 <your_script_name>.py card view start
Categories
Kubernetes

Migrating GCP GKE to AWS EKS

Create Google Cloud Build Trigger (GCP)

  1. First, make sure you’ve installed and initialized the gcloud CLI tool.
  2. Create a build trigger:
gcloud builds triggers create github \
  --repo-name=<YOUR_GITHUB_REPO_NAME> \
  --repo-owner=<YOUR_GITHUB_USERNAME> \
  --branch-pattern="^master$" \
  --build-config=<PATH_TO_YOUR_CLOUDBUILD.yaml>

Replace placeholders like <YOUR_GITHUB_REPO_NAME>, <YOUR_GITHUB_USERNAME>, and <PATH_TO_YOUR_CLOUDBUILD.yaml> with your specific values.
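
If you do not already have a cloudbuild.yaml, a minimal one that builds and pushes a container image might look like the following (the image name is illustrative):

steps:
  # Build the container image from the repository's Dockerfile
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA', '.']
# Push the built image to Container Registry
images:
  - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'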

Create AWS CodeBuild Project

  1. Install and configure the aws CLI tool.
  2. Create a CodeBuild project using a JSON build specification:
aws codebuild create-project \
    --name <PROJECT_NAME> \
    --source "type=GITHUB,location=https://github.com/<GITHUB_USERNAME>/<GITHUB_REPO>.git,buildspec=<PATH_TO_YOUR_buildspec.yml>" \
    --artifacts "type=S3,location=<S3_BUCKET_NAME>" \
    --environment "type=LINUX_CONTAINER,computeType=BUILD_GENERAL1_SMALL,image=aws/codebuild/standard:5.0" \
    --service-role <IAM_ROLE_ARN>

Replace placeholders like <PROJECT_NAME>, <GITHUB_USERNAME>, <GITHUB_REPO>, <S3_BUCKET_NAME>, <IAM_ROLE_ARN>, and <PATH_TO_YOUR_buildspec.yml> with your specific values.

Run these commands in your terminal to create the respective build setups.
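
If you need a starting point for the buildspec as well, a minimal buildspec.yml that builds and pushes an image to ECR might look like this (account, region, and repository values are placeholders):

version: 0.2
phases:
  pre_build:
    commands:
      # Authenticate Docker against your ECR registry
      - aws ecr get-login-password --region <REGION> | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com
  build:
    commands:
      - docker build -t <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:latest .
  post_build:
    commands:
      - docker push <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:latest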


Dockerfile for GCP (Utilizes Google Storage)

# Using base image
FROM python:3.8

# Set environment variables for GCP
ENV GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json

# Install gsutil (note: the google-cloud-sdk package requires Google's Cloud SDK apt repository to be configured first)
RUN apt-get update && apt-get install -y google-cloud-sdk

# Copy files and use gsutil
COPY . .
RUN gsutil cp gs://<your-gcp-bucket>/some-file /some-directory/

Dockerfile for AWS (Utilizes S3)

# Using base image
FROM python:3.8

# Set environment variables for AWS (for real builds, prefer build args or an IAM role over baking credentials into the image)
ENV AWS_ACCESS_KEY_ID=<your-access-key>
ENV AWS_SECRET_ACCESS_KEY=<your-secret-key>

# Install AWS CLI
RUN apt-get update && apt-get install -y awscli

# Copy files and use AWS S3 commands
COPY . .
RUN aws s3 cp s3://<your-aws-bucket>/some-file /some-directory/

Helm Chart Deployment Example for GCP with GCR Image

Here’s a short Helm chart for deploying a Kubernetes Deployment that uses an image from Google Container Registry (GCR).

# gcp-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gcp-app
  template:
    metadata:
      labels:
        app: gcp-app
    spec:
      containers:
      - name: gcp-container
        image: gcr.io/<PROJECT_ID>/<IMAGE_NAME>:<TAG>

Helm Chart Deployment Example for AWS with ECR Image

Here’s a short Helm chart for deploying a Kubernetes Deployment that uses an image from Amazon Elastic Container Registry (ECR).

# aws-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aws-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aws-app
  template:
    metadata:
      labels:
        app: aws-app
    spec:
      containers:
      - name: aws-container
        image: <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO_NAME>:<TAG>

Replace placeholders like <PROJECT_ID>, <IMAGE_NAME>, <TAG>, <ACCOUNT_ID>, <REGION>, and <REPO_NAME> with your specific values.

To deploy these, you’d typically run helm install or helm upgrade with these files as input.
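
For example, if these manifests live under a chart's templates/ directory, the deployment might look like the following (chart path and release names are illustrative); plain kubectl apply also works for the standalone files shown above:

# Apply a standalone manifest directly
kubectl apply -f aws-deployment.yaml

# Or install/upgrade a chart that contains these templates
helm upgrade --install my-app ./my-chart --namespace default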


Helm Chart for Ingress in GCP

Manual SSL

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "your-static-ip"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
  - secretName: ssl-cert
  defaultBackend:
    service:
      name: my-service
      port:
        number: 80

Auto Managed SSL

apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: managed-cert
spec:
  domains:
    - example.com
    - dev.example.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "your-static-ip"
    kubernetes.io/ingress.class: "gce"
    networking.gke.io/managed-certificates: managed-cert
spec:
  defaultBackend:
    service:
      name: my-service
      port:
        number: 80

Helm Chart for Ingress in AWS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: "your-aws-certificate-arn"
spec:
  tls:
  - hosts:
    - "your.domain.com"
    secretName: aws-cert
  rules:
    ...

Helm Chart with Standard Storage in GCP

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Code for Creating and Assigning Storage in AWS (PV and PVC)

aws ec2 create-volume \
  --availability-zone us-west-2a \
  --size 20 \
  --volume-type gp2

# Assign the returned volume id to a variable
VOLUME_ID="vol-XXXXXXXXXXXXXXXXX"

# Substitute the actual volume id for ${VOLUME_ID} (e.g. via envsubst) before applying
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-aws-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    volumeID: ${VOLUME_ID}
    fsType: ext4

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-aws-pvc
spec:
  storageClassName: gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
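
To use either claim, a workload mounts it by name. A minimal sketch for the AWS claim above (pod name, container image, and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-aws-pvc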

Terraform code to assign GCP Global Load Balancer IP address to A Record

provider "google" {
  credentials = file("<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON>")
  project     = "<PROJECT_ID>"
  region      = "us-central1"
}

resource "google_dns_managed_zone" "my_zone" {
  name     = "my-zone"
  dns_name = "example.com."
}

resource "google_dns_record_set" "my_a_record" {
  name         = "sub.example.com."
  type         = "A"
  ttl          = 300
  managed_zone = google_dns_managed_zone.my_zone.name
  rrdatas      = [google_compute_global_address.my_global_address.address]
}

resource "google_compute_global_address" "my_global_address" {
  name = "my-global-address"
}

Replace <PATH_TO_YOUR_SERVICE_ACCOUNT_JSON> with the path to your Google Cloud service account JSON file, and <PROJECT_ID> with your Google Cloud Project ID.

Terraform code to assign AWS Route53 to Load Balancer (A Record)

provider "aws" {
  region = "us-west-2"
}

resource "aws_route53_zone" "my_zone" {
  name = "example.com."
}

resource "aws_route53_record" "my_a_record" {
  zone_id = aws_route53_zone.my_zone.zone_id
  name    = "sub.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.my_lb.dns_name
    zone_id                = aws_lb.my_lb.zone_id
    evaluate_target_health = false
  }
}

resource "aws_lb" "my_lb" {
  name               = "my-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.my_group.id]  # assumes this security group is defined elsewhere
  subnets            = [aws_subnet.my_subnet.id]         # ALBs require at least two subnets in different AZs in practice
  enable_deletion_protection = false
}

AWS Terraform Code for ALB and EKS with Subnets

provider "aws" {
  region = "us-west-2"
}

# Define VPC
resource "aws_vpc" "my_vpc" {
  cidr_block = "10.0.0.0/16"
}

# Create two public subnets
resource "aws_subnet" "my_subnet1" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.1.0/24"
  map_public_ip_on_launch = true  # Public Subnet
}

resource "aws_subnet" "my_subnet2" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.2.0/24"
  map_public_ip_on_launch = true  # Public Subnet
}

# ALB tied to public subnets
resource "aws_lb" "my_lb" {
  name               = "my-lb"
  internal           = false
  load_balancer_type = "application"
  subnets            = [aws_subnet.my_subnet1.id, aws_subnet.my_subnet2.id]
  enable_deletion_protection = false
}

# EKS Cluster
resource "aws_eks_cluster" "this" {
  name     = "my-cluster"
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    subnet_ids = [aws_subnet.my_subnet1.id, aws_subnet.my_subnet2.id]
  }
}

resource "aws_iam_role" "eks_cluster" {
  name = "eks-cluster"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Action = "sts:AssumeRole",
        Effect = "Allow",
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })
}
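
One thing the snippet above leaves out: EKS needs the AmazonEKSClusterPolicy managed policy attached to the cluster role before the cluster can be created. A typical attachment looks like this:

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  role       = aws_iam_role.eks_cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}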

GCP Terraform Code for Global Load Balancer and GKE with Subnets

provider "google" {
  credentials = file("<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON>")
  project     = "<PROJECT_ID>"
  region      = "us-central1"
}

# Create VPC
resource "google_compute_network" "my_network" {
  name = "my-network"
}

# Create public subnet
resource "google_compute_subnetwork" "my_public_subnet" {
  name          = "my-public-subnet"
  ip_cidr_range = "10.0.1.0/24"
  network       = google_compute_network.my_network.self_link
  region        = "us-central1"  # Public Subnet
}

# Create Global HTTP Load Balancer components
resource "google_compute_global_forwarding_rule" "global_forwarding_rule" {
  name       = "my-global-lb-forwarding-rule"
  target     = google_compute_target_http_proxy.default.self_link
  port_range = "80"
  ip_address = google_compute_global_address.default.address
}

resource "google_compute_global_address" "default" {
  name = "my-global-ip"
}

resource "google_compute_target_http_proxy" "default" {
  name  = "my-http-proxy"
  url_map = google_compute_url_map.default.self_link
}

resource "google_compute_url_map" "default" {
  name        = "my-url-map"
  default_service = google_compute_backend_service.default.self_link
}

resource "google_compute_backend_service" "default" {
  name        = "my-backend-service"
  port_name   = "http"
  protocol    = "HTTP"
  timeout_sec = 10

  backend {
    group = google_compute_instance_group_manager.default.instance_group
  }
}

resource "google_compute_instance_group_manager" "default" {
  name = "my-instance-group"
  base_instance_name = "my-instance"
  instance_template = google_compute_instance_template.default.self_link
  zone = "us-central1-a"
}

resource "google_compute_instance_template" "default" {
  name_prefix  = "my-instance-template-"
  machine_type = "g1-small"
  tags         = ["http-server"]

  network_interface {
    network = google_compute_network.my_network.name
  }
}

# GKE Cluster
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"
  network    = google_compute_network.my_network.name
  subnetwork = google_compute_subnetwork.my_public_subnet.name  # Public Subnet

  initial_node_count = 3
}

Public and Private Subnets in Load Balancers

  • AWS: You can associate both public and private subnets with an ALB. However, the ALB itself is either internal or internet-facing, not both. You would typically use an internet-facing ALB with public subnets and an internal ALB with private subnets.
  • GCP: Google Cloud Load Balancers operate differently than AWS ALBs. The Global Load Balancer doesn’t directly associate with subnets; rather, it directs traffic to instance groups, which can be in different regions and subnets. You can have backend services in both public and private subnets.
Categories
Automation

Jumpcutter and Auto-Editor Commands

Jumpcutter

# Clone repository
git clone https://github.com/carykh/jumpcutter.git

# Move into directory
cd jumpcutter

# Install Requirements
pip install -r requirements.txt

# Basic Usage
python jumpcutter.py --input_file <input_video_file>

# Specify silence speed
python jumpcutter.py --input_file <input_video_file> --silent_speed 1

# Specify sounded speed
python jumpcutter.py --input_file <input_video_file> --sounded_speed 1.2

# Debug (No specific debug flag)
python jumpcutter.py --input_file <input_video_file> --frame_margin 1

To find the value for <input_video_file>:

ls *.mp4  # Lists all mp4 files in the current directory

Auto-Editor

# Install auto-editor
pip install auto-editor

# Basic Usage
auto-editor <input_video_file>

# Specify output format
auto-editor <input_video_file> --format mp4

# Change speed factor
auto-editor <input_video_file> --fast_speed 1.5

# Specify audio language
auto-editor <input_video_file> --audio_language eng

# Debug
auto-editor <input_video_file> --debug

To find the value for <input_video_file>:

ls *.mp4  # Lists all mp4 files in the current directory

Commands for Debugging

# For auto-editor
auto-editor <input_video_file> --debug

# For jumpcutter (No specific debug flag)
python jumpcutter.py --input_file <input_video_file> --frame_margin 1

Replace <input_video_file> with your video file name.

Categories
Kubernetes

Adding MySQL Dashboards to Grafana using Helm

mysql-exporter-values.yaml

mysql:
  db: ""
  host: "172.1.0.6"
  pass: "IjTdP4ri79"
  port: 3306
  protocol: ""
  user: "root"

serviceMonitor:
  # enabled should be set to true to enable prometheus-operator discovery of this service
  enabled: true
  additionalLabels:
    release: grafana

Install Kube-Prometheus-Stack on Cluster

  1. Add Prometheus Helm Repo
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  2. Update Helm Repo
    helm repo update
  3. Create Namespace ‘metrics’
    kubectl create ns metrics
  4. Set Current Context to ‘metrics’ Namespace
    kubectl config set-context --current --namespace=metrics
  5. Deploy Grafana & Kube-Prometheus-Stack
    helm upgrade --install grafana prometheus-community/kube-prometheus-stack

Install Prometheus MySQL Exporter

  1. Add Prometheus Helm Repo (if not added)
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  2. Update Helm Repo
    helm repo update
  3. Deploy MySQL Exporter
    helm upgrade --install mysql-exporter prometheus-community/prometheus-mysql-exporter -f mysql-exporter-values.yaml

Install Prometheus MySQL Exporter Dashboard

  1. Grafana Dashboard IDs for MySQL Monitoring
    • 14057
    • 7362

Combined Code Block

# Add Prometheus Helm Repo and Update
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create Namespace 'metrics' and Set Context
kubectl create ns metrics
kubectl config set-context --current --namespace=metrics

# Deploy Grafana & Kube-Prometheus-Stack
helm upgrade --install grafana prometheus-community/kube-prometheus-stack

# Deploy MySQL Exporter
helm upgrade --install mysql-exporter prometheus-community/prometheus-mysql-exporter -f mysql-exporter-values.yaml
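
To open Grafana and import the dashboards above (IDs 14057 and 7362), you can port-forward the service and fetch the generated admin password. The service and secret names below follow the kube-prometheus-stack convention for a release named grafana; adjust them if your release differs:

# Get the Grafana admin password
kubectl get secret grafana-grafana -n metrics -o jsonpath="{.data.admin-password}" | base64 -d; echo

# Port-forward Grafana to http://localhost:3000 (log in as admin)
kubectl port-forward svc/grafana-grafana -n metrics 3000:80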
Categories
Kubernetes

Enable HTTPS on AWS ALB with Kubernetes

Step-01: Register a Domain in Route53 (if not exists)

  1. AWS Management Console -> Services -> Route53 -> Registered Domains
  2. Click on “Register Domain”
  3. Enter domain: example.com -> Click “Check”
  4. “Add to cart” -> “Continue”
  5. Fill Contact Details -> “Continue”
  6. Enable “Automatic Renewal”
  7. Accept Terms -> “Complete Order”

Step-02: Create SSL Certificate in Certificate Manager

  1. Services -> Certificate Manager -> “Create a Certificate”
  2. “Request a Certificate” -> Choose “Request a public certificate”
  3. Add domain: *.example.com
  4. Select “DNS Validation” -> “Confirm & Request”
  5. “Create record in Route 53”
  6. Wait 5-10 mins -> Check Validation Status

Step-03: Update Ingress Manifest with SSL Annotations

07-ALB-Ingress-SSL.yml

# SSL Annotations
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
alb.ingress.kubernetes.io/certificate-arn: <YOUR_CERTIFICATE_ARN>

To get <YOUR_CERTIFICATE_ARN>, run: aws acm list-certificates --query 'CertificateSummaryList[*].CertificateArn' --output text
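
For reference, a minimal Ingress using these annotations might look like the following (service name and path are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-ssl-demo
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/certificate-arn: <YOUR_CERTIFICATE_ARN>
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80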

Step-04: Deploy Manifests and Test

  1. Deploy
kubectl apply -f kube-manifests/
  2. Verify
  • Load Balancer: 80 & 443
  • Target Groups: Health checks
  • kubectl get ingress

Step-05: Add DNS in Route53

  1. Services -> Route 53 -> Hosted Zones -> Click example.com
  2. “Create a Record Set”
  3. Name: ssldemo.example.com
  4. Alias: Yes
  5. Alias Target: <ALB_DNS_Name>
  6. “Create”

To get <ALB_DNS_Name>, run: kubectl get ingress -n <namespace> -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'

Step-06: Access Application

  • HTTP URLs: http://ssldemo.example.com/<your_app_endpoints>
  • HTTPS URLs: https://ssldemo.example.com/<your_app_endpoints>

Debug Commands (If needed)

aws route53 list-hosted-zones
aws acm list-certificates
kubectl get svc
kubectl get ingress

Combined One Block Code

aws acm list-certificates --query 'CertificateSummaryList[*].CertificateArn' --output text
kubectl apply -f kube-manifests/
kubectl get ingress -n <namespace> -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'
aws route53 list-hosted-zones
aws acm list-certificates
kubectl get svc
kubectl get ingress
Categories
Kubernetes

Install SSL in an EKS Cluster Using cert-manager

Steps to Install SSL in an EKS Cluster Using cert-manager and aws-pca-issuer

Prerequisites

  • AWS CLI, eksctl, kubectl, and Helm must be installed.

1. Create IAM Policy File

Create a file named pca-iam-policy.json and save the IAM policy inside it.

echo '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuer",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate",
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:acm-pca:<region>:<account_id>:certificate-authority/<resource_id>"
    }
  ]
}' > pca-iam-policy.json

2. Create IAM Policy

Run the following AWS CLI command.

aws iam create-policy \
    --policy-name AWSPCAIssuerIAMPolicy \
    --policy-document file://pca-iam-policy.json

Note down the returned policy ARN.

3. Create IAM Role and ServiceAccount

Replace <AWS_ACCOUNT_ID> with your AWS account ID.

eksctl create iamserviceaccount \
--cluster=nlb-lab \
--namespace=aws-pca-issuer \
--name=aws-pca-issuer \
--attach-policy-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:policy/AWSPCAIssuerIAMPolicy \
--override-existing-serviceaccounts \
--approve

4. Install cert-manager and aws-pca-issuer

helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
helm install aws-pca-issuer awspca/aws-privateca-issuer -n aws-pca-issuer --set serviceAccount.create=false --set serviceAccount.name=aws-pca-issuer
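
Note that the aws-privateca-issuer chart assumes cert-manager is already running in the cluster. If it is not installed yet, a typical installation via the Jetstack chart looks like this:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true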

5. Verify Installation

kubectl get pods --namespace aws-pca-issuer

6. Create ClusterIssuer

Replace placeholders and create cluster-issuer.yaml.

apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAClusterIssuer
metadata:
  name: demo-test-root-ca
spec:
  arn: arn:aws:acm-pca:<region>:<account-id>:certificate-authority/<resource_id>
  region: <region>

kubectl apply -f cluster-issuer.yaml

7. Create Certificate

Replace domain name and create nlb-lab-tls.yaml.

kind: Certificate
apiVersion: cert-manager.io/v1
metadata:
  name: nlb-lab-tls-cert
spec:
  commonName: www.nlb-lab.com
  dnsNames:
    - www.nlb-lab.com
    - nlb-lab.com
  duration: 2160h0m0s
  issuerRef:
    group: awspca.cert-manager.io
    kind: AWSPCAClusterIssuer
    name: demo-test-root-ca
  renewBefore: 360h0m0s
  secretName: nlb-tls-app-secret
  usages:
    - server auth
    - client auth
  privateKey:
    algorithm: "RSA"
    size: 2048

kubectl apply -f nlb-lab-tls.yaml

8. Verify Certificate

kubectl get certificate
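
Once the certificate is Ready, the issued key pair lives in the nlb-tls-app-secret Secret, which you can reference from a TLS-terminating resource. A minimal sketch for an Ingress (service name and paths are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nlb-lab-ingress
spec:
  tls:
    - hosts:
        - www.nlb-lab.com
      secretName: nlb-tls-app-secret
  rules:
    - host: www.nlb-lab.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nlb-lab-service
                port:
                  number: 80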

Combined Code Block

#!/bin/bash

# Define variables
REGION="<region>"
ACCOUNT_ID="<account_id>"
RESOURCE_ID="<resource_id>"

# Create IAM policy
cat <<EOF > pca-iam-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuer",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate",
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:acm-pca:$REGION:$ACCOUNT_ID:certificate-authority/$RESOURCE_ID"
    }
  ]
}
EOF

# Create IAM Policy
aws iam create-policy \
    --policy-name AWSPCAIssuerIAMPolicy \
    --policy-document file://pca-iam-policy.json

# Create IAM Role and ServiceAccount
eksctl create iamserviceaccount \
--cluster=nlb-lab \
--namespace=aws-pca-issuer \
--name=aws-pca-issuer \
--attach-policy-arn=arn:aws:iam::$ACCOUNT_ID:policy/AWSPCAIssuerIAMPolicy \
--override-existing-serviceaccounts \
--approve

# Install cert-manager and aws-pca-issuer
helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
helm install aws-pca-issuer awspca/aws-privateca-issuer -n aws-pca-issuer --set serviceAccount.create=false --set serviceAccount.name=aws-pca-issuer

# Verify Installation
kubectl get pods --namespace aws-pca-issuer

# Create ClusterIssuer
kubectl apply -f cluster-issuer.yaml

# Create Certificate
kubectl apply -f nlb-lab-tls.yaml

# Verify Certificate
kubectl get certificate

Replace placeholders like <region>, <account_id>, and <resource_id> with actual values. Use AWS CLI commands or refer to your AWS console to find these.

Categories
Apache

No code signing authority for module

Problem Summary:

You can’t load the PHP module in Apache on macOS because it’s not signed.

Steps to Fix:

1. Install codesign Tool:

xcode-select --install

2. Create Certificate Authority & Code Signing Certificate

  • Open “Keychain Access” (CMD+Space -> type “Keychain Access”).
  • Follow instructions in these articles to create certificates:
  • Article 1
  • Article 2
  • Article 3

3. Sign PHP Module

Replace <common_name> with your common name.

common_name="<common_name>"
codesign --sign "$common_name" --force --keychain ~/Library/Keychains/login.keychain-db /usr/local/opt/php/lib/httpd/modules/libphp.so

4. Verify Signing

codesign -dv --verbose=4 "/usr/local/opt/php/lib/httpd/modules/libphp.so"

5. Update Apache Config

Replace <authority> with the Authority from step 4.

authority="<authority>"
sed -i '' "s|LoadModule php_module.*|LoadModule php_module /usr/local/opt/php/lib/httpd/modules/libphp.so \"$authority\"|" /etc/apache2/httpd.conf

6. Restart Apache Server

apachectl restart

7. Debugging

Check Apache config and logs.

apachectl configtest
cat /var/log/apache2/error_log

Combined Code Block:

# Step 1
xcode-select --install
# Step 3
common_name="<common_name>"
codesign --sign "$common_name" --force --keychain ~/Library/Keychains/login.keychain-db /usr/local/opt/php/lib/httpd/modules/libphp.so
# Step 4
codesign -dv --verbose=4 "/usr/local/opt/php/lib/httpd/modules/libphp.so"
# Step 5
authority="<authority>"
sed -i '' "s|LoadModule php_module.*|LoadModule php_module /usr/local/opt/php/lib/httpd/modules/libphp.so \"$authority\"|" /etc/apache2/httpd.conf
# Step 6
apachectl restart
# Step 7
apachectl configtest
cat /var/log/apache2/error_log