Definitions

127 definitions

🧬 Abrogated

🧬 Alternative Splicing

⚓ Anchors (URL Fragments)

📊 ANSI SQL

🧬 AT-hooks

🧪 ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing)

👀 Attention Mechanism

🔧 Autograd

↩️ Backpropagation

🧮 BAF and PBAF Complexes

🔄 Canonical Polyadic (CP) Decomposition

⛓️ Chain Rule

🧬 Chimeric

🧩 Chinese Remainder Theorem

🔬 ChIP-seq (Chromatin Immunoprecipitation Sequencing)

🔄 Chirality

🧪 Chromatin Remodeling

🔍 Chromodomains and Bromodomains

🧬 Cis DNA Regulatory Elements

👁️ Convolutional Neural Network (CNN)

🌐 CORS (Cross-Origin Resource Sharing)

📊 Cross Entropy Loss

🧪 Cytotoxicity

⚖️ Dalton (Da)

🔬 Deconvolution Analysis

🔮 Deep Learning

🦕 Deno

🔐 Diffie-Hellman Key Exchange

🔄 Dimers

🌐 DNS (Domain Name System)

🏠 Domain

📚 Edify

🔀 Elastic Net

🔠 Embedding

🧬 Endogenous

🧬 Epochs

📐 Euler's Number (e)

🔢 Exponent Rules

🎯 Fine-tuning

🔢🔍 Float (Floating-Point Number)

🔬📊 Floating-Point Precision

🧬 Gene Fusion Events

🌐 Generalized CP (GCP) Decomposition

🎨 Generative Adversarial Network (GAN)

🔍✅ Git Triage:

⬇️ Gradient Descent

🧠 Hebbian Theory

🧩 Hi-C (Chromosome Conformation Capture + Sequencing)

🧬 HLA Imputation

🧬 Homologs

🌐 Internet Protocol (IP)

🌐 IP Address (Internet Protocol Address)

🧬 Isoforms

🧪 Kinases

📊 Lasso Regression

🧠 Long Short-Term Memory (LSTM)

📊 Loss Function

🤖 Machine Learning

🔬 Mechanistic Modeling

📊 Mixed-Effects Models

🧬 Moiety

📈 Monotonic

🔄 Monovalent Molecular Glue Degraders

🧩 Mosaicism Detection

🔗 Multi-Layer Perceptron (MLP)

🩸 Myelodysplastic Syndrome (MDS)

🧠 Neural Network

🟢 Node.js

📈 Non-parametric Statistics

🔍⏱️ NP (Nondeterministic Polynomial Time)

🧩🔄 NP-Complete

🏋️‍♂️🧠 NP-Hard

⚡🎯 Optimizer

🧬 Orthologs

📈 Overfitting

⏱️✓ P (Polynomial Time)

🧬 Paralogs

📊 Parametric Statistics

📊 Pearson Correlation Coefficient

🧬 Pleiotropy

🔄 Pluripotency

🧮 Pohlig-Hellman Algorithm

🚪 Ports

📍 Positional Encoding

⌨️ Prompt Engineering

🧬 PROteolysis TArgeting Chimeras (PROTACs)

🔗 Protocol/Scheme

➗ Quotient Rule

🔄 Recurrent Neural Network (RNN)

🎮 Reinforcement Learning

⚡ ReLU (Rectified Linear Unit)

🔄🚀 Repository Dispatch:

🌐 RESTful API

📈 Ridge Regression

🧬 RNA-seq (RNA sequencing)

🧬 sgRNA (Single Guide RNA)

🔒🧮 SHA-256 Checksum

🔗 Similarity Network Fusion (SNF)

🔓 Small Subgroup Vulnerabilities

📊 Softmax

🧬 Somatic Mutations

🔐💻 SSH (Secure Shell)

🔄 Stereoisomers

🎲 Stochastic

🌐 Subdomain

👨‍🏫 Supervised Learning

🧪 Svedberg Sedimentation Coefficient

🎯 Targeted Protein Degradation

🤝📬 TCP (Transmission Control Protocol)

📊 Tensor Decomposition

🔢 Tensor Processing Unit (TPU)

🔒 TLS (Transport Layer Security)

✂️ Tokenization

🧬 Transcription Factor (TF) Networks

🔄 Transformer

🧩 Tucker Decomposition

🔄 Ubiquitin Proteasome System (UPS)

🚀📨 UDP (User Datagram Protocol)

🔍 Unsupervised Learning

🔗 URL (Uniform Resource Locator)

🗜️ Variational Autoencoder (VAE)

🗄️ Vector Database

🔌💻 Verilog

🧫🔬 Western Blot/Immunoblot

📝 Word Embedding

🧬 Xenologs

🧬 Zinc Fingers

⛓️ Chain Rule

math, machine-learning

A fundamental rule in calculus for finding the derivative of composite functions. The chain rule states that if you have a composite function f(g(x)), then its derivative is the derivative of the outer function evaluated at the inner function, multiplied by the derivative of the inner function.

Mathematically expressed as:

\frac{d}{dx}[f(g(x))] = f'(g(x)) \cdot g'(x)

Or in Leibniz notation:

\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}

The chain rule is essential for differentiating complex functions and is widely used in calculus, physics, engineering, and machine learning (particularly in backpropagation algorithms for neural networks). Common applications include finding derivatives of exponential functions, trigonometric functions with inner functions, and nested polynomial expressions.
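As a quick sanity check of the formula, the sketch below compares the chain-rule derivative of sin(x²) with a finite-difference estimate. The choice of f(u) = sin(u) and g(x) = x², and the function names, are illustrative assumptions rather than part of the definition.

```python
import math

def f_of_g(x):
    """Composite function f(g(x)) with f(u) = sin(u) and g(x) = x**2."""
    return math.sin(x ** 2)

def chain_rule_derivative(x):
    """f'(g(x)) * g'(x) = cos(x**2) * 2x."""
    return math.cos(x ** 2) * 2 * x

def numerical_derivative(func, x, h=1e-6):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 1.3
print(chain_rule_derivative(x), numerical_derivative(f_of_g, x))
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite step size h.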

🧬 Chimeric

dictionary, biology

Refers to something that is made from parts originating from different sources—combined into a single entity.

🧩 Chinese Remainder Theorem

math

A fundamental result in number theory that provides a solution to systems of simultaneous linear congruences with coprime moduli. The theorem states that if one has several congruence equations, a unique solution exists modulo the product of the moduli, provided that the moduli are pairwise coprime.

Formally, if n₁, n₂, ..., nₖ are pairwise coprime positive integers and a₁, a₂, ..., aₖ are any integers, then the system of congruences x ≡ a₁ (mod n₁), x ≡ a₂ (mod n₂), ..., x ≡ aₖ (mod nₖ) has a unique solution modulo N = n₁ × n₂ × ... × nₖ.
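The constructive proof of the theorem translates directly into code. The following is a minimal sketch, assuming pairwise coprime moduli; the function name `crt` is hypothetical, and the modular inverse relies on Python 3.8+ three-argument `pow` with a negative exponent.

```python
from math import prod

def crt(residues, moduli):
    """Solve x ≡ residues[i] (mod moduli[i]) for pairwise coprime moduli.

    Returns the unique solution in the range [0, N), where N is the
    product of the moduli.
    """
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = N // n_i               # product of all the other moduli
        inverse = pow(N_i, -1, n_i)  # raises ValueError if not coprime
        x += a_i * N_i * inverse
    return x % N

# Classic puzzle: x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7)
print(crt([2, 3, 2], [3, 5, 7]))  # → 23
```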

The theorem has applications in various fields including cryptography (RSA algorithm), coding theory, and computer science (particularly in distributed computing and for creating efficient algorithms). It also has historical significance, originating in ancient Chinese mathematics as early as the 3rd century CE in the mathematical text "Sunzi Suanjing."

Video explanation:
Chinese Remainder Theorem - A comprehensive explanation of the Chinese Remainder Theorem, its proof, and applications.

🔬 ChIP-seq (Chromatin Immunoprecipitation Sequencing)

biology, lab-techniques

• Purpose: Maps protein-DNA interactions, such as where transcription factors or modified histones bind DNA.

• Data type: Enrichment peaks showing binding locations.

• Used for: Studying gene regulation, histone modifications, and epigenetic changes.

🔄 Chirality

biology

• Definition: The geometric property where a molecule cannot be superimposed on its mirror image, similar to how left and right hands are non-superimposable mirror images of each other.

• Key concepts:
- Chiral center: Typically a carbon atom bonded to four different groups
- Enantiomers: Mirror-image forms of a chiral molecule
- Optical activity: Chiral molecules rotate plane-polarized light
- Racemic mixture: Equal mixture of both enantiomers

• Biological importance:
- Enzyme specificity: Most enzymes interact with only one enantiomer of a substrate
- Drug efficacy and safety: Different enantiomers can have dramatically different biological effects
- Protein structure: All natural amino acids (except glycine) are chiral
- Nucleic acid structure: The sugar component in DNA and RNA is chiral

• Nomenclature systems:
- R/S system: Based on Cahn-Ingold-Prelog priority rules
- D/L system: Based on the configuration of glyceraldehyde
- (+)/(-) system: Based on the direction of rotation of plane-polarized light

• Applications in bioinformatics and computational chemistry:
- Molecular modeling: Accurate representation of 3D molecular structures
- Drug discovery: Virtual screening of specific enantiomers
- Protein-ligand interactions: Predicting binding affinities of chiral molecules
- Cheminformatics: Algorithms for detecting and representing chirality

🧪 Chromatin Remodeling

biology

• Definition: The dynamic process by which specialized protein complexes alter chromatin structure to regulate DNA accessibility for transcription, replication, repair, and recombination.

• Mechanisms:
- Nucleosome sliding (moving histone proteins): Repositioning nucleosomes along DNA without disrupting histone-DNA contacts
- Histone eviction/replacement: Removing or exchanging histones to alter chromatin composition
- Histone modification: Adding or removing chemical groups that affect chromatin compaction
- ATP-dependent remodeling: Using energy from ATP hydrolysis to physically restructure chromatin

Chromatin Structure showing nucleosomes, histones, and the difference between heterochromatin and euchromatin

• Major remodeling complex families:
- SWI/SNF family (BAF/PBAF): Nucleosome sliding and ejection
- ISWI family: Nucleosome spacing and assembly
- CHD family: Nucleosome sliding and histone deacetylation
- INO80/SWR1 family: Histone variant exchange (H2A.Z incorporation)

• Biological roles:
- Transcriptional regulation: Controlling gene expression by modulating promoter accessibility
- DNA replication: Ensuring replication machinery access to DNA
- DNA repair: Facilitating repair protein access to damaged DNA
- Development: Orchestrating cell fate decisions and differentiation

• Clinical significance:
- Cancer: Mutations in chromatin remodelers are frequent in many cancer types
- Developmental disorders: Associated with intellectual disability and congenital abnormalities
- Aging: Dysregulation of chromatin remodeling contributes to aging phenotypes
- Therapeutic targeting: Emerging strategies for modulating chromatin remodeling in disease

🔍 Chromodomains and Bromodomains

biology

• Definition: Specialized protein domains that recognize specific histone modifications and mediate chromatin-based processes.

• Chromodomains:
- Structure: ~60 amino acid modules that fold into a three-stranded anti-parallel β-sheet and an α-helix
- Recognition specificity: Primarily bind to methylated lysine residues on histone tails
- Key interactions: Form an aromatic cage that accommodates the methylated lysine
- Notable examples: HP1 (binds H3K9me3), Polycomb proteins (bind H3K27me3), CHD family proteins

• Bromodomains:
- Structure: ~110 amino acid modules consisting of four α-helices forming a hydrophobic pocket
- Recognition specificity: Recognize and bind to acetylated lysine residues on histone tails
- Key interactions: Hydrogen bonding and hydrophobic interactions with the acetyl-lysine
- Notable examples: BRD family proteins, TAF1, PCAF, BRG1/BRM (in SWI/SNF complexes)

• Biological functions:
- Epigenetic regulation: Translating histone modifications into functional outcomes
- Transcriptional control: Recruiting transcriptional machinery to specific chromatin regions
- Chromatin remodeling: Directing remodeling complexes to appropriate genomic locations
- DNA repair: Facilitating access of repair machinery to damaged DNA

• Applications in research and medicine:
- Epigenetic inhibitors: Bromodomain inhibitors (BETi) as emerging cancer therapeutics
- Drug discovery: Structure-based design of small molecules targeting these domains
- Biomarkers: Expression patterns as diagnostic or prognostic indicators
- Synthetic biology: Engineered chromatin readers for targeted gene regulation

🧬 Cis DNA Regulatory Elements

biology

• Definition: Non-coding DNA sequences that control the transcription of nearby genes on the same chromosome by serving as binding sites for transcription factors and other regulatory proteins.

• Major types:
- Promoters: Core sequences located near transcription start sites that direct RNA polymerase binding and initiation
- Enhancers: Distal elements that increase transcription rates, often in a tissue-specific manner
- Silencers: Sequences that repress gene expression by binding negative regulatory factors
- Insulators: Boundary elements that block enhancer-promoter interactions or prevent heterochromatin spreading
- Response elements: Specific sequences that respond to environmental signals or cellular states

• Structural and functional characteristics:
- Contain specific DNA motifs recognized by transcription factors
- Can function over variable distances from target genes
- Often exhibit evolutionary conservation across species
- Frequently organized into clusters called cis-regulatory modules (CRMs)
- Can be tissue-specific, developmental stage-specific, or condition-responsive

• Identification methods:
- Comparative genomics: Identifying conserved non-coding sequences
- ChIP-seq: Mapping transcription factor binding sites genome-wide
- ATAC-seq: Identifying regions of open chromatin
- Reporter assays: Testing regulatory activity of candidate sequences
- Massively parallel reporter assays (MPRAs): High-throughput functional screening

• Biological significance:
- Orchestrate spatiotemporal gene expression patterns during development
- Mediate cellular responses to environmental stimuli
- Contribute to cell type-specific gene expression profiles
- Form the physical basis for gene regulatory networks

• Clinical and evolutionary relevance:
- Mutations in regulatory elements contribute to human disease
- Regulatory variation drives phenotypic diversity within and between species
- Therapeutic targeting of transcription factor-DNA interactions
- Synthetic biology applications in designing artificial gene circuits

👁️ Convolutional Neural Network (CNN)

machine-learning

A specialized neural network architecture inspired by the visual cortex. It uses sliding filters to automatically learn and detect important features in grid-like data (especially images), making it powerful for tasks like facial recognition, object detection, and medical image analysis.
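The "sliding filter" idea can be illustrated without any framework. The sketch below slides a small kernel over a 2D grid and sums the element-wise products at each position (no padding, stride 1). The function name `conv2d` and the hand-written edge-detecting kernel are assumptions for illustration; in a real CNN the kernel values are learned during training.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` and return the map of weighted sums."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A 3x4 "image" with a vertical edge between columns 1 and 2
image = [[0, 0, 1, 1]] * 3
edge_kernel = [[-1, 1]]  # responds where intensity rises left to right
print(conv2d(image, edge_kernel))  # → [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```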

🌐 CORS (Cross-Origin Resource Sharing)

computer science, web development, networking

A security mechanism implemented by web browsers that allows or restricts web pages from making requests to a different domain, protocol, or port than the one serving the web page. CORS is a relaxation of the Same-Origin Policy, which by default blocks cross-origin requests for security reasons.

How CORS Works:

When a web application running on one domain (e.g., https://example.com) tries to access resources from another domain (e.g., https://api.service.com), the browser initiates a CORS check. For simple requests, the browser adds an Origin header and checks the response for appropriate CORS headers. For complex requests, the browser first sends a preflight request (OPTIONS method) to determine if the actual request is allowed.

Key CORS Headers:

- Access-Control-Allow-Origin: Specifies which origins can access the resource
- Access-Control-Allow-Methods: Lists allowed HTTP methods (GET, POST, PUT, etc.)
- Access-Control-Allow-Headers: Specifies allowed request headers
- Access-Control-Allow-Credentials: Indicates if credentials can be included
- Access-Control-Max-Age: Sets how long preflight responses can be cached
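On the server side, these headers are usually computed from the request's Origin header. The sketch below is one possible way to do that with an explicit allow-list. The names `ALLOWED_ORIGINS` and `cors_headers`, the listed methods and headers, and the 600-second cache lifetime are all illustrative assumptions, not a recommended policy.

```python
# Hypothetical allow-list; a real service would load this from configuration.
ALLOWED_ORIGINS = {"https://example.com"}

def cors_headers(origin, allow_credentials=False):
    """Return CORS response headers for a request's Origin header value.

    Unknown or missing origins get no CORS headers, so the browser
    blocks the cross-origin read.
    """
    if origin is None or origin not in ALLOWED_ORIGINS:
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,  # echo only vetted origins
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
        "Access-Control-Max-Age": "600",  # cache preflight for 10 minutes
        "Vary": "Origin",  # response differs per origin, so caches must key on it
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers

print(cors_headers("https://example.com"))
print(cors_headers("https://untrusted.example"))  # → {}
```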

Common CORS Scenarios:

1. API Requests: Frontend applications calling REST APIs on different domains
2. CDN Resources: Loading fonts, images, or scripts from content delivery networks
3. Microservices: Services communicating across different domains
4. Third-party Integrations: Embedding widgets or accessing external services

CORS vs Same-Origin Policy:

The Same-Origin Policy is a fundamental security concept that restricts how documents or scripts from one origin can interact with resources from another origin. CORS provides a controlled way to relax this restriction, allowing servers to specify which cross-origin requests are permitted while maintaining security.

Security Considerations:

While CORS enables legitimate cross-origin requests, misconfiguration can create security vulnerabilities. Browsers refuse to combine a wildcard (*) Access-Control-Allow-Origin with credentials, so a common mistake is to echo back whatever Origin the request carries, which effectively grants every site credentialed access. Overly permissive CORS policies of this kind can expose applications to attacks. Proper CORS configuration should follow the principle of least privilege, only allowing necessary origins and methods.

📊 Cross Entropy Loss

machine-learning, math

A loss function commonly used in classification problems, particularly multi-class classification. Cross entropy loss measures the difference between the predicted probability distribution and the true distribution (one-hot encoded labels). It penalizes confident wrong predictions more heavily than uncertain predictions.
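To make the penalty on confident wrong predictions concrete, the sketch below computes the loss for a single one-hot labeled sample under three hypothetical predictions. The function name and the probability values are illustrative assumptions.

```python
import math

def cross_entropy(y_true, p_pred):
    """Cross entropy for one sample: -sum(y_i * log(p_i)).

    With one-hot labels only the true class contributes, so terms with
    y_i == 0 are skipped (this also avoids evaluating log(0)).
    """
    return -sum(y * math.log(p) for y, p in zip(y_true, p_pred) if y > 0)

y = [0, 1, 0]  # the true class is index 1
confident_right = cross_entropy(y, [0.05, 0.90, 0.05])
uncertain = cross_entropy(y, [0.34, 0.33, 0.33])
confident_wrong = cross_entropy(y, [0.90, 0.05, 0.05])

# The loss grows as the probability assigned to the true class shrinks.
print(confident_right, uncertain, confident_wrong)
```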

Mathematically, for a single sample with true class y and predicted probabilities p, the cross entropy loss is:

L = -\sum_{i=1}^{C} y_i \log(p_i)

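where C is the number of classes. For binary classification, this simplifies to:

L = -[y \log(p) + (1 - y) \log(1 - p)]

where y ∈ {0, 1} is the true label and p is the predicted probability of the positive class.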

🔠 Embedding

machine-learning

A technique that converts discrete data (like words or categories) into dense vectors of continuous numbers. These learned representations capture semantic relationships and similarities, enabling AI models to process categorical data effectively. It's fundamental to modern NLP and recommendation systems.

🧬 Endogenous

dictionary

Refers to something that originates or is produced from within an organism, tissue, or cell

🧬 Epochs

machine-learning

In machine learning and deep learning, an epoch refers to one complete pass through the entire training dataset during the training process. During each epoch, the model sees every training example once and updates its parameters accordingly. Multiple epochs are typically required to train a model effectively, with the number of epochs being a hyperparameter that affects model performance and training time.
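The sketch below shows where epochs sit in a training loop: the outer loop counts epochs and the inner loop visits every training example once. The one-parameter model y = w·x, the toy data, and the default hyperparameters are assumptions chosen only to keep the example small.

```python
def train(xs, ys, epochs=50, learning_rate=0.05):
    """Fit y = w * x with per-example gradient updates.

    Returns the learned weight and the mean squared error after each epoch.
    """
    w = 0.0
    history = []
    for _ in range(epochs):            # one epoch = one full pass over the data
        for x, y in zip(xs, ys):       # the model sees every example once
            error = w * x - y
            w -= learning_rate * 2 * error * x  # gradient of (w*x - y)**2
        mse = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append(mse)
    return w, history

w, history = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(w, history[0], history[-1])
```

Tracking the per-epoch loss in `history` is a common way to judge whether more epochs are still helping.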

📐 Euler's Number (e)

math

A fundamental mathematical constant approximately equal to 2.71828, denoted by the letter 'e' in honor of the Swiss mathematician Leonhard Euler. It is defined as the limit of (1 + 1/n)ⁿ as n approaches infinity, or equivalently as the sum of the infinite series:

e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots

Euler's number is the base of the natural logarithm and appears naturally in many areas of mathematics, particularly in calculus where it serves as the unique number such that the derivative of eˣ equals eˣ itself. This property makes it invaluable for solving differential equations and modeling exponential growth and decay processes.
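Both definitions can be checked numerically. The sketch below, with hypothetical function names, evaluates the partial sums of the series and the limit expression; the series converges far faster than the limit.

```python
import math

def e_series(terms):
    """Partial sum of 1/n! for n = 0 .. terms - 1."""
    return sum(1 / math.factorial(n) for n in range(terms))

def e_limit(n):
    """(1 + 1/n)**n, which approaches e as n grows."""
    return (1 + 1 / n) ** n

print(e_series(20), e_limit(10 ** 6), math.e)
```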

Key Applications:
- Compound Interest: Continuous compounding formula A = Pe^(rt)
- Population Growth: Exponential growth models in biology and demographics
- Radioactive Decay: Half-life calculations in physics and chemistry
- Probability Theory: Normal distribution and Poisson processes
- Signal Processing: Fourier transforms and complex analysis
- Machine Learning: Activation functions (sigmoid, softmax) and optimization algorithms
- Economics: Present value calculations and economic modeling

The constant e is irrational and transcendental, meaning it cannot be expressed as a simple fraction or as the root of any polynomial equation with rational coefficients. Its ubiquity in natural phenomena has earned it the designation as one of the most important mathematical constants alongside π.

🔢 Exponent Rules

math

A set of fundamental algebraic rules that govern operations with exponential expressions. These rules are essential for simplifying expressions, solving equations, and working with logarithms and exponential functions.

Basic Exponent Rules:
- Product Rule: $a^m \cdot a^n = a^{m+n}$
- Quotient Rule: $\frac{a^m}{a^n} = a^{m-n}$ (where $a \neq 0$)
- Power Rule: $(a^m)^n = a^{mn}$
- Power of a Product: $(ab)^n = a^n b^n$
- Power of a Quotient: $\left(\frac{a}{b}\right)^n = \frac{a^n}{b^n}$ (where $b \neq 0$)
- Zero Exponent: $a^0 = 1$ (where $a \neq 0$)
- Negative Exponent: $a^{-n} = \frac{1}{a^n}$ (where $a \neq 0$)
- Fractional Exponent: $a^{\frac{m}{n}} = \sqrt[n]{a^m} = (\sqrt[n]{a})^m$
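The rules above can be spot-checked numerically. This sketch assumes positive bases (so fractional powers stay real) and compares both sides of each identity with `math.isclose`, since floating-point results may differ in the last digits. The function name is hypothetical.

```python
import math

def check_exponent_rules(a, b, m, n):
    """Return, per rule, whether both sides agree for the given values."""
    close = math.isclose
    return {
        "product": close(a ** m * a ** n, a ** (m + n)),
        "quotient": close(a ** m / a ** n, a ** (m - n)),
        "power": close((a ** m) ** n, a ** (m * n)),
        "power_of_product": close((a * b) ** n, a ** n * b ** n),
        "power_of_quotient": close((a / b) ** n, a ** n / b ** n),
        "zero_exponent": a ** 0 == 1,
        "negative_exponent": close(a ** -n, 1 / a ** n),
        "fractional_exponent": close(a ** (m / n), (a ** (1 / n)) ** m),
    }

print(check_exponent_rules(2.0, 3.0, 5, 3))
```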

These rules form the foundation for working with exponential and logarithmic functions, compound interest calculations, scientific notation, and are extensively used in algebra, calculus, physics, chemistry, and computer science algorithms.

🎯 Fine-tuning

machine-learning

The process of taking a pre-trained model and adapting it to a specific task by training it on a smaller, task-specific dataset. This transfer learning approach saves computational resources and often yields better results than training from scratch.

🔢🔍 Float (Floating-Point Number)

computer science, math

A float, or floating-point number, is a data type used in computer programming to represent real numbers that can have a fractional part. Unlike integers, which represent whole numbers, floats can represent a wide range of values, including very small and very large numbers, as well as numbers with decimal points. Floating-point numbers are typically stored in a format defined by the IEEE 754 standard, which specifies how to represent the number using a sign bit, an exponent, and a significand (or mantissa). Common floating-point types include single-precision (usually 32-bit) and double-precision (usually 64-bit), offering different ranges and levels of precision. While versatile, floating-point arithmetic can introduce small inaccuracies due to the finite way real numbers are approximated, leading to potential rounding errors or loss of precision in calculations.
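The three IEEE 754 fields can be inspected directly. The sketch below packs a value as a 32-bit single-precision float and splits the bit pattern into sign, biased exponent, and fraction; the function name is hypothetical.

```python
import struct

def float32_fields(x):
    """Return (sign, biased_exponent, fraction) of x as an IEEE 754 float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits of the significand
    return sign, exponent, fraction

# -6.25 = -1.5625 * 2**2, so the biased exponent is 127 + 2 = 129
print(float32_fields(-6.25))  # → (1, 129, 4718592)
```

For normal numbers the significand has an implicit leading 1, so the value is (-1)^sign × (1 + fraction / 2²³) × 2^(exponent − 127).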

Reference: How floating point works - jan Misali

🔬📊 Floating-Point Precision

computer science, math

Floating-point precision refers to the number of significant digits that can be accurately represented by a floating-point data type. It determines how close the stored floating-point number can be to the true mathematical value. Precision is limited because computers store numbers in a finite number of bits. The IEEE 754 standard defines common formats like single-precision (float) and double-precision (double). Single-precision typically offers about 7 decimal digits of precision, while double-precision offers about 15-17 decimal digits.

What this means in practice is that calculations involving floating-point numbers may not always be exact. For example, representing 0.1 in binary floating-point is not perfectly accurate, similar to how 1/3 cannot be perfectly represented as a finite decimal. This can lead to:
- Rounding Errors: Small discrepancies that occur when a number is rounded to fit the available precision.
- Loss of Significance: When subtracting two nearly equal numbers, significant digits can be lost, leading to a result with much lower relative accuracy.
- Comparison Issues: Directly comparing two floating-point numbers for equality (e.g., `a == b`) can be unreliable due to these small precision differences. It's often better to check if their absolute difference is within a small tolerance (epsilon).
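A short demonstration of the effects listed above, using Python's double-precision floats. The variable names are illustrative, and `math.isclose` is used here with its default relative tolerance.

```python
import math

# Rounding error: 0.1 and 0.2 have no exact binary representation.
total = 0.1 + 0.2
exact_match = (total == 0.3)            # False
tolerant_match = math.isclose(total, 0.3)  # True

# Loss of significance: the tiny term is absorbed before the subtraction.
absorbed = (1.0 + 1e-16) - 1.0          # 0.0 instead of 1e-16

print(total, exact_match, tolerant_match, absorbed)
```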

Understanding floating-point precision is crucial in scientific computing, financial calculations, and any domain where numerical accuracy is important, as ignoring these limitations can lead to incorrect results or unexpected behavior in programs.

Reference: How floating point works - jan Misali

🧬 Gene Fusion Events

biology

• Definition: Hybrid genes formed by the joining of two previously separate genes, typically resulting from chromosomal rearrangements such as translocations, inversions, or deletions.

• Formation mechanisms:
- Chromosomal translocations: Exchange of genetic material between non-homologous chromosomes
- Chromosomal inversions: Reversal of a DNA segment within a chromosome
- Tandem duplications: Duplication of a segment followed by fusion
- Transcription-mediated gene fusion: Read-through transcription between adjacent genes
- Trans-splicing: Joining of exons from separate pre-mRNA molecules

• Structural characteristics:
- Breakpoint junctions: Points where the two genes are joined
- Fusion domains: Protein domains contributed by each partner gene
- Reading frame: Determines if the fusion produces a functional protein
- Regulatory elements: Promoters and enhancers that control fusion gene expression

• Detection methods:
- RNA-seq with fusion-detection algorithms (STAR-Fusion, FusionCatcher)
- Whole genome sequencing to identify genomic breakpoints
- FISH (Fluorescence In Situ Hybridization) for known fusions
- RT-PCR with fusion-specific primers
- Mass spectrometry for fusion protein detection

• Biological and clinical significance:
- Oncogenic drivers in many cancer types (e.g., BCR-ABL in chronic myeloid leukemia)
- Diagnostic biomarkers for cancer classification
- Therapeutic targets for precision medicine approaches
- Evolutionary mechanism for new gene function
- Contribution to genetic diversity and adaptation

• Notable examples:
- BCR-ABL1 in chronic myeloid leukemia (Philadelphia chromosome)
- EML4-ALK in non-small cell lung cancer
- TMPRSS2-ERG in prostate cancer
- PML-RARA in acute promyelocytic leukemia
- SYT-SSX in synovial sarcoma

🌐 Generalized CP (GCP) Decomposition

math, machine-learning

An extension of the standard CP decomposition that incorporates different loss functions and constraints to handle various data types (binary, count, continuous) and missing values. GCP provides more flexibility for modeling complex real-world data with non-Gaussian characteristics.

In machine learning, GCP enables robust pattern discovery in heterogeneous multi-way data, supporting applications like topic modeling across document collections, community detection in dynamic networks, and analyzing sparse, noisy biological measurements across multiple experimental conditions.

🎨 Generative Adversarial Network (GAN)

machine-learning

An AI architecture where two networks compete: one creates fake data, while the other tries to distinguish real from fake. This competition drives both to improve, resulting in increasingly realistic synthetic data. GANs have revolutionized AI-generated art, deepfakes, and synthetic data generation.

🔍✅ Git Triage:

git

The process of reviewing, labeling, categorizing, and prioritizing issues and pull requests in a Git repository to ensure effective project management and workflow. Git triage helps maintain order, identify critical tasks, eliminate duplicates, and streamline collaboration among contributors.

⬇️ Gradient Descent

machine-learning, math

A fundamental optimization algorithm used to train models by iteratively adjusting parameters to minimize a loss function. The algorithm computes the gradient (partial derivatives) of the loss function with respect to each parameter and updates parameters in the direction opposite to the gradient, effectively moving downhill toward a minimum.
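The procedure described above fits in a few lines. The sketch below minimizes the one-dimensional function f(w) = (w − 3)², whose gradient is 2(w − 3); the function, starting point, learning rate, and step count are assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)  # move opposite to the gradient
    return w

# f(w) = (w - 3)**2 has its minimum at w = 3
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step; a rate that is too large (here, above 1.0) would make the iterates diverge instead.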

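Mathematically, the update rule is:

\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t)

where θ denotes the model parameters, η the learning rate (step size), and L the loss function.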

🧬 PROteolysis TArgeting Chimeras (PROTACs)

biology, lab-techniques

• Definition: Bifunctional molecules designed to induce targeted protein degradation by simultaneously binding to a protein of interest and an E3 ubiquitin ligase, bringing them into proximity to facilitate ubiquitination and subsequent proteasomal degradation of the target protein.

• Structural components:
- Target protein-binding ligand: Binds specifically to the protein of interest
- E3 ligase-binding ligand: Recruits an E3 ubiquitin ligase (e.g., CRBN, VHL, IAP, MDM2)
- Linker: Connects the two ligands and optimizes their spatial arrangement

• Mechanism of action:
- Target engagement: Binding to both the target protein and E3 ligase
- Ternary complex formation: Creation of a three-molecule complex (target-PROTAC-E3 ligase)
- Ubiquitination: Transfer of ubiquitin molecules to the target protein
- Proteasomal degradation: Recognition and degradation of the polyubiquitinated target
- Recycling: Release of the PROTAC for additional rounds of target degradation

• Advantages over traditional inhibitors:
- Event-driven pharmacology: Effect persists after drug clearance
- Catalytic mechanism: One PROTAC can facilitate degradation of multiple target proteins
- Broader target scope: Can address previously undruggable proteins
- Potential to overcome resistance: Complete protein removal versus functional inhibition
- Degradation of all protein functions: Not limited to active site inhibition

• Design considerations:
- E3 ligase selection: Tissue expression, binding affinity, and substrate compatibility
- Linker optimization: Length, composition, and flexibility
- Target ligand selection: Binding affinity, selectivity, and attachment point
- Ternary complex geometry: Spatial arrangement for optimal ubiquitination

• Clinical development status:
- Multiple candidates in clinical trials for various cancers
- ARV-110 (androgen receptor degrader) for prostate cancer
- ARV-471 (estrogen receptor degrader) for breast cancer
- DT2216 (BCL-xL degrader) for hematologic malignancies

• Challenges and limitations:
- High molecular weight: Potential issues with cell permeability and oral bioavailability
- Complex structure: Synthetic challenges and potential metabolic instability
- Hook effect: Decreased efficacy at high concentrations
- Tissue-specific E3 ligase expression: Potential limitations in tissue selectivity

🔗 Protocol/Scheme

networking, web-development

The first part of a URL that specifies the communication protocol or method used to access a resource on the internet. Common protocols include HTTP (Hypertext Transfer Protocol), HTTPS (HTTP Secure), FTP (File Transfer Protocol), and others.

The protocol/scheme appears at the beginning of a URL, followed by a colon. For schemes that name a host, such as HTTP and HTTPS, the colon is followed by two forward slashes (://). For example, in 'https://www.example.com', 'https' is the protocol that indicates secure HTTP communication should be used.
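Python's standard library can extract the scheme from a URL. The sketch below uses `urllib.parse.urlsplit`; the example URLs are placeholders.

```python
from urllib.parse import urlsplit

def scheme_of(url):
    """Return the protocol/scheme portion of a URL (lower-cased by urlsplit)."""
    return urlsplit(url).scheme

print(scheme_of("https://www.example.com"))     # → https
print(scheme_of("ftp://files.example.com/a"))   # → ftp
print(scheme_of("mailto:user@example.com"))     # → mailto
```

The last example shows a scheme with no host component, which is why it has no double slash after the colon.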

Common Protocols:
- HTTP: Standard web protocol for transferring web pages
- HTTPS: Secure version of HTTP with encryption
- FTP: File Transfer Protocol for uploading/downloading files
- SMTP: Simple Mail Transfer Protocol for email
- SSH: Secure Shell for remote server access
- FILE: Local file system access

The protocol determines how the client (browser) will communicate with the server to retrieve the requested resource.

➗ Quotient Rule

math

A differentiation rule used to find the derivative of a function that is the quotient (division) of two other functions. If you have a function h(x) = f(x)/g(x), where both f(x) and g(x) are differentiable and g(x) ≠ 0, then the quotient rule provides the formula for h'(x).

The quotient rule formula is:

\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{f'(x) \cdot g(x) - f(x) \cdot g'(x)}{[g(x)]^2}

Often remembered by the mnemonic "low d-high minus high d-low, over low squared" where "high" refers to the numerator function and "low" refers to the denominator function. This rule is particularly useful in calculus for differentiating rational functions, rates of change problems, and optimization problems involving ratios.
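As a worked example (the function is chosen here purely for illustration), take f(x) = x^2 as "high" and g(x) = x + 1 as "low", so that f'(x) = 2x and g'(x) = 1:

\frac{d}{dx}\left[\frac{x^2}{x+1}\right] = \frac{2x \cdot (x+1) - x^2 \cdot 1}{(x+1)^2} = \frac{x^2 + 2x}{(x+1)^2}

The result is valid wherever g(x) ≠ 0, that is, for x ≠ -1.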

🔄 Recurrent Neural Network (RNN)

machine-learning

A neural network designed for sequential data that maintains a "memory" of previous inputs. Like having a short-term memory, it processes information in order and uses past context to understand current inputs, making it suitable for tasks like text prediction and time series analysis.

🎮 Reinforcement Learning

machine-learning

A type of machine learning where an agent learns to make decisions by interacting with an environment. The agent receives rewards or penalties based on its actions, similar to how humans learn through trial and error. It's crucial for robotics, game AI, and autonomous systems.

⚡ ReLU (Rectified Linear Unit)

machine-learning, math

A widely used activation function in neural networks that outputs the input directly if it's positive and zero otherwise. Mathematically defined as f(x) = max(0, x), ReLU is simple yet effective at introducing non-linearity into neural networks while being computationally efficient. It helps mitigate the vanishing gradient problem that plagued earlier activation functions like sigmoid and tanh, allowing for faster training of deep networks. ReLU has become the default activation function for hidden layers in most modern architectures, though variants like Leaky ReLU and ELU address some of its limitations, such as the "dying ReLU" problem where neurons can become permanently inactive.
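A minimal scalar sketch of ReLU and the Leaky ReLU variant mentioned above. The default `negative_slope` of 0.01 is a commonly used value but is an assumption here, as are the function names.

```python
def relu(x):
    """f(x) = max(0, x)."""
    return max(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs so the
    gradient there is not exactly zero."""
    return x if x > 0 else negative_slope * x

inputs = [-2.0, -0.5, 0.0, 1.5]
print([relu(x) for x in inputs])
print([leaky_relu(x) for x in inputs])
```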

🔄🚀 Repository Dispatch:

git

A GitHub feature that enables external events to trigger GitHub Actions workflows in a repository. It allows custom webhook events to initiate automated processes, facilitating integration with external services and enabling programmatic workflow execution. Repository dispatch is commonly used for creating custom automation triggers and coordinating workflows across multiple repositories.
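A dispatch event is created with a POST request to the repository's `dispatches` endpoint of the GitHub REST API, and the receiving workflow must declare `repository_dispatch` as a trigger. The sketch below only builds the request without sending it. The function name, the owner, repository, token, and event names are placeholders.

```python
import json
import urllib.request

def build_dispatch_request(owner, repo, token, event_type, client_payload=None):
    """Build (but do not send) a repository_dispatch request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps(
        {"event_type": event_type, "client_payload": client_payload or {}}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder values; a real token needs permission to write to the repository.
request = build_dispatch_request(
    "example-owner", "example-repo", "TOKEN", "rebuild-docs", {"ref": "main"}
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```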

🌐 RESTful API

computer science, networking, web development

An architectural style for designing web services that follows REST (Representational State Transfer) principles. RESTful APIs use standard HTTP methods to perform operations on resources identified by URLs. They are stateless, meaning each request contains all necessary information, and typically return data in JSON format.

HTTP Methods:

GET: Retrieves data from a resource without modifying it. Safe and idempotent operation used for reading data.
- Example: `GET /users/123` - Retrieve user with ID 123

POST: Creates new resources or submits data for processing. Not idempotent, as multiple requests may create multiple resources.
- Example: `POST /users` - Create a new user

PUT: Updates or replaces an entire resource. Idempotent operation that overwrites the complete resource.
- Example: `PUT /users/123` - Replace user 123 with new data

PATCH: Partially updates a resource by modifying only specified fields. More efficient than PUT for small changes.
- Example: `PATCH /users/123` - Update only specific fields of user 123

DELETE: Removes a resource from the server. Idempotent operation for resource deletion.
- Example: `DELETE /users/123` - Delete user with ID 123

HEAD: Retrieves only the headers of a resource without the body content. Useful for checking resource existence or metadata.
- Example: `HEAD /users/123` - Check if user 123 exists

Key Characteristics:

Resource-based: Everything is treated as a resource with a unique URL
Stateless: Each request is independent and contains all needed information
Cacheable: Responses can be cached to improve performance
Uniform Interface: Consistent way of interacting with resources
Layered System: Can include intermediary layers like proxies and gateways

RESTful APIs are widely used for web applications, mobile apps, and microservices due to their simplicity, scalability, and compatibility with web standards.
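To make the method semantics concrete, here is a minimal in-memory sketch of a `/users` resource. The `UserStore` class and its `handle` method are hypothetical and stand in for a real web framework. The status codes follow common HTTP conventions.

```python
import itertools


class UserStore:
    """Toy in-memory /users resource illustrating HTTP method semantics."""

    def __init__(self):
        self._users = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        """Return (status_code, response_body) for a request."""
        parts = path.strip("/").split("/")
        if parts[0] != "users":
            return 404, None
        user_id = int(parts[1]) if len(parts) > 1 else None

        if method == "POST" and user_id is None:
            # Not idempotent: every call creates a new resource.
            new_id = next(self._ids)
            self._users[new_id] = dict(body or {})
            return 201, {"id": new_id, **self._users[new_id]}
        if user_id is None:
            return 405, None
        if method == "PUT":
            # Idempotent: repeating the call leaves the same stored state.
            created = user_id not in self._users
            self._users[user_id] = dict(body or {})
            return (201 if created else 200), {"id": user_id, **self._users[user_id]}
        if user_id not in self._users:
            return 404, None
        if method == "GET":
            return 200, {"id": user_id, **self._users[user_id]}
        if method == "HEAD":
            return 200, None  # headers only, no body
        if method == "PATCH":
            self._users[user_id].update(body or {})  # change only the given fields
            return 200, {"id": user_id, **self._users[user_id]}
        if method == "DELETE":
            del self._users[user_id]
            return 204, None
        return 405, None


if __name__ == "__main__":
    store = UserStore()
    print(store.handle("POST", "/users", {"name": "Ada"}))
    print(store.handle("PUT", "/users/123", {"name": "Grace"}))
    print(store.handle("PATCH", "/users/123", {"email": "grace@example.org"}))
    print(store.handle("DELETE", "/users/123"))
```

Calling `POST /users` twice yields two users, while calling `PUT /users/123` twice leaves exactly one user with the same data, which is the practical meaning of idempotency here.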

📈 Ridge Regression

math, machine-learning

A regularization technique that addresses multicollinearity in linear regression by adding an L2 penalty term to the cost function. Unlike Lasso, Ridge Regression shrinks coefficients toward zero but rarely sets them exactly to zero, keeping all features in the model while reducing their impact. This approach is particularly effective when dealing with highly correlated predictors, preventing the model from assigning excessive importance to any single variable.

Ridge Regression excels in scenarios where all features contribute to the outcome but need to be constrained to prevent overfitting, such as in economic forecasting, climate modeling, and biomedical research.
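The effect of the L2 penalty on correlated predictors can be seen in a small sketch. It solves the ridge normal equations (XᵀX + λI)w = Xᵀy for exactly two features and no intercept, using plain Python. The function name, the toy data, and the value of λ are illustrative assumptions.

```python
def ridge_fit_2d(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for two features, no intercept."""
    a = sum(r[0] * r[0] for r in X) + lam   # top-left entry of X^T X + lam*I
    b = sum(r[0] * r[1] for r in X)         # off-diagonal entry
    d = sum(r[1] * r[1] for r in X) + lam   # bottom-right entry
    g0 = sum(r[0] * t for r, t in zip(X, y))
    g1 = sum(r[1] * t for r, t in zip(X, y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("singular system: features are collinear and lam is 0")
    # Closed-form inverse of a 2x2 matrix applied to X^T y.
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)


if __name__ == "__main__":
    # Two perfectly correlated features: ordinary least squares (lam = 0) has no unique solution.
    X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    y = [2.0, 4.0, 6.0]
    print(ridge_fit_2d(X, y, lam=1.0))
```

With λ = 0 the system is singular for this data, whereas any λ > 0 yields a unique solution that splits the weight evenly between the two identical features rather than assigning excessive importance to one of them.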

🧬 RNA-seq (RNA sequencing)

biology, lab-techniques

• Purpose: Measures gene expression, that is, which genes are being transcribed into RNA and at what level.

• Data type: Sequencing reads aligned to genes/transcripts.

• Used for: Identifying differentially expressed genes between conditions (e.g., normal vs. cancer cells).

🧬 sgRNA (Single Guide RNA)

biology, lab-techniques

• Definition: A synthetic RNA molecule that combines the functions of crRNA (CRISPR RNA) and tracrRNA (trans-activating crRNA) into a single structure, used to guide Cas nucleases to specific DNA targets in CRISPR-Cas genome editing systems.

• Structure and components:
- Spacer sequence (20 nucleotides): Complementary to the target DNA sequence
- Scaffold sequence (~80 nucleotides): Forms secondary structures necessary for Cas protein binding
- PAM (Protospacer Adjacent Motif): DNA sequence required for target recognition (not part of sgRNA but essential for targeting)

• Design considerations:
- Target specificity: Minimizing off-target effects through careful sequence selection
- GC content: Optimal range of 40-60% for efficient binding
- Secondary structure: Avoiding self-complementarity that could interfere with target binding
- Position effects: Targeting the beginning of genes or critical functional domains
- PAM proximity: Selecting targets with appropriate PAM sequences for the Cas variant used
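Two of the design considerations above, GC content and PAM proximity, are easy to sketch in code. The example below scans only the forward strand for 20-nucleotide spacers followed by an NGG PAM (the motif recognized by SpCas9) and keeps those with 40-60% GC. The function names and the input sequence are hypothetical, and real design tools also score off-targets and secondary structure.

```python
import re


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_candidate_spacers(dna, spacer_len=20, gc_min=0.40, gc_max=0.60):
    """List (position, spacer, PAM, GC) for forward-strand sites followed by an NGG PAM."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    candidates = []
    for match in pattern.finditer(dna):
        spacer, pam = match.group(1), match.group(2)
        gc = gc_content(spacer)
        if gc_min <= gc <= gc_max:
            candidates.append((match.start(), spacer, pam, round(gc, 2)))
    return candidates


if __name__ == "__main__":
    toy_sequence = "ATGCATGCATGCATGCATGC" + "TGG"  # made-up sequence for illustration
    for hit in find_candidate_spacers(toy_sequence):
        print(hit)
```

A fuller version would also scan the reverse complement, since Cas9 can target either strand.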

• Bioinformatic tools for sgRNA design:
- CHOPCHOP: Web tool for CRISPR/Cas9 target prediction and off-target evaluation
- CRISPOR: Comprehensive tool for guide selection and off-target prediction
- E-CRISP: Design tool with evaluation of on-target efficiency and off-target effects
- Cas-Designer: Tool for designing guide RNAs for various CRISPR systems
- CRISPRscan: Algorithm for predicting sgRNA efficiency in vivo

• Applications in genome editing:
- Gene knockout: Introducing frameshift mutations through NHEJ repair
- Gene knock-in: Precise sequence insertion via HDR pathway
- Base editing: Creating specific nucleotide changes without double-strand breaks
- Epigenetic modification: Targeting chromatin modifiers to specific genomic loci
- Transcriptional regulation: Activating or repressing gene expression (CRISPRa/CRISPRi)
- Multiplexed editing: Simultaneous modification of multiple genomic targets

• Computational challenges:
- Off-target prediction: Algorithms to identify potential unintended targets
- Efficiency prediction: Machine learning models to estimate editing efficiency
- Repair outcome prediction: Predicting the spectrum of editing outcomes
- Visualization tools: Representing complex genomic targeting information
- Data integration: Combining sgRNA design with functional genomics data

🔒🧮 SHA-256 Checksum

computer science

SHA-256 (Secure Hash Algorithm 256-bit) is a cryptographic hash function that generates a fixed-size 256-bit (32-byte) hash value, typically rendered as a 64-character hexadecimal string. It belongs to the SHA-2 family of hash functions, designed by the National Security Agency (NSA) and published by NIST. SHA-256 is widely used for data integrity verification through checksums, which let users verify that a file or message has not been altered during transmission or storage. It has several critical properties: it is deterministic (the same input always produces the same output), fast to compute, collision-resistant by design (finding two different inputs that produce the same hash is computationally infeasible), and exhibits the avalanche effect (a small change in input drastically changes the output). SHA-256 is extensively used in digital signatures, blockchain technology (particularly Bitcoin), SSL/TLS certificates, and file verification systems. Because it is fast, it is suitable for password storage only inside a deliberately slow, salted key-derivation scheme such as PBKDF2. Unlike encryption, SHA-256 is a one-way function: it is computationally infeasible to derive the original input from the hash value.
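A checksum workflow can be sketched with Python's standard `hashlib` module. The helper names `sha256_hex`, `file_sha256`, and `matches_checksum` are illustrative, and the chunk size is an arbitrary choice.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare against a published checksum (case-insensitive, constant-time)."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


if __name__ == "__main__":
    print(sha256_hex(b"abc"))
    print(sha256_hex(b"abd"))  # one changed character gives an unrelated digest
```

Comparing the two printed digests shows the avalanche effect: the inputs differ in a single character, yet the outputs share no visible pattern.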

🔗 Similarity Network Fusion (SNF)

machine-learning, computer science

• Definition: A computational method that integrates multiple data types to create a unified patient similarity network, enabling more comprehensive analysis than single-data approaches.

• Algorithm principles:
- Constructs similarity networks for each data type separately
- Iteratively updates each network by fusing information from other networks
- Converges to a single integrated network that captures complementary information across data types
- Uses spectral clustering for patient stratification and subtype identification

• Applications in multi-omics integration:
- Cancer subtyping: Identifying disease subtypes by integrating genomic, transcriptomic, and clinical data
- Biomarker discovery: Finding robust biomarkers across multiple data platforms
- Patient stratification: Grouping patients with similar molecular profiles across different data types
- Drug response prediction: Integrating molecular and pharmacological data to predict treatment outcomes

• Advantages over single-data analysis:
- Increased statistical power through data integration
- Robustness to noise in individual data types
- Ability to capture complementary information across heterogeneous data
- Improved prediction accuracy for clinical outcomes

• Implementation considerations:
- Parameter selection (number of neighbors, fusion iterations)
- Data normalization across different platforms
- Computational efficiency for large datasets
- Visualization of integrated networks

🔓 Small Subgroup Vulnerabilities

computer science, math

A cryptographic weakness that can occur in implementations of protocols using discrete logarithm-based cryptography, particularly Diffie-Hellman key exchange. These vulnerabilities arise when an attacker forces computations into a small subgroup of the larger cryptographic group, making it feasible to recover information about the private key (its value modulo the subgroup order) by brute force.

Small subgroup attacks exploit improper parameter validation, specifically when implementations fail to verify that received public keys are members of the correct cryptographic group of appropriate order. By sending carefully crafted invalid public values, attackers can extract information about the victim's private key through multiple protocol interactions. Proper implementation requires validation of all public keys and the use of safe primes or prime order subgroups to mitigate these vulnerabilities.
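The validation step described above can be sketched as follows, assuming a safe prime p = 2q + 1 and a generator of the order-q subgroup. The parameters p = 23, q = 11, g = 2 are toy values chosen for readability and are far too small for real use. The function name is hypothetical.

```python
def is_valid_public_key(y: int, p: int, q: int) -> bool:
    """Accept y only if it lies in the order-q subgroup of the integers mod p."""
    # Reject 0, 1, and p - 1: these are trivial values or have order 2.
    if not (2 <= y <= p - 2):
        return False
    # Every element of the order-q subgroup satisfies y^q = 1 (mod p).
    return pow(y, q, p) == 1


# Toy parameters (assumption for illustration): p = 2q + 1 with q prime, g of order q.
P, Q, G = 23, 11, 2

if __name__ == "__main__":
    honest_key = pow(G, 7, P)  # a legitimate public key g^x mod p
    print(honest_key, is_valid_public_key(honest_key, P, Q))
    print(P - 1, is_valid_public_key(P - 1, P, Q))  # order-2 element, rejected
    print(5, is_valid_public_key(5, P, Q))          # outside the order-q subgroup, rejected
```

With a safe prime, the only small subgroups have order 1 or 2, so the range check and the single exponentiation together rule out the crafted values an attacker would send.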

📊 Softmax

machine-learning, math

A mathematical function that converts a vector of real numbers into a probability distribution, where each output value is between 0 and 1 and all outputs sum to 1. Softmax is commonly used as the final activation function in multi-class classification problems, transforming raw model outputs (logits) into interpretable probabilities for each class.

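Mathematically defined as:

$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \quad i = 1, \dots, K$$

where z is the vector of K logits.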
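A minimal Python sketch of softmax, in which each output is the exponential of one logit divided by the sum of the exponentials of all logits. Subtracting the largest logit first is a common numerical-stability trick that leaves the result unchanged. The function name and the sample logits are illustrative.

```python
import math


def softmax(logits):
    """Convert a list of real-valued logits into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    probabilities = softmax([2.0, 1.0, 0.1])
    print(probabilities, sum(probabilities))
```

The largest logit always receives the largest probability, and equal logits receive equal probabilities.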

🧪 Western Blotting (Immunoblotting)

lab-techniques, biology

Definition
Western blotting (immunoblotting) is a powerful analytical technique used in molecular biology and proteomics to detect, identify, and semi-quantify specific proteins within a complex mixture of proteins extracted from cells or tissues [1]. The technique derives its name from its position in the "blotting" family, following Southern blotting (for DNA) and Northern blotting (for RNA) [1].

Principle
Western blotting uses three key elements to accomplish protein detection [2]:
1. Separation by size: Proteins are separated based on molecular weight through gel electrophoresis
2. Transfer to solid support: Separated proteins are transferred to a protein-binding membrane
3. Immunodetection: Target proteins are marked using specific primary and secondary antibodies for visualization

Protocol
The Western blot procedure typically involves the following steps [2, 4]:

1. Sample preparation: Proteins are extracted from cells or tissues using mechanical disruption, chemical extraction, or enzymatic methods
2. Gel electrophoresis: Proteins are separated based on molecular weight using polyacrylamide gel electrophoresis (PAGE), typically with sodium dodecyl sulfate (SDS-PAGE)
3. Protein transfer: Separated proteins are transferred from the gel to a membrane (nitrocellulose or PVDF) using electrophoresis (electroblotting)
4. Blocking: The membrane is blocked with a protein solution (often non-fat dry milk or BSA) to prevent non-specific antibody binding
5. Primary antibody incubation: The membrane is incubated with a primary antibody specific to the target protein
6. Washing: Unbound primary antibodies are washed away
7. Secondary antibody incubation: The membrane is incubated with a labeled secondary antibody that binds to the primary antibody
8. Detection: The protein-antibody complex is visualized using various detection methods

Detection Methods
Several detection methods can be used in Western blotting [1, 4]:

1. Chromogenic detection: Uses enzyme-conjugated secondary antibodies (like HRP or AP) that produce a colored precipitate when exposed to a substrate
2. Chemiluminescent detection: Uses enzyme-conjugated antibodies that produce light when exposed to a substrate, captured on film or by digital imaging systems
3. Fluorescent detection: Uses fluorophore-conjugated antibodies that emit light when excited at specific wavelengths
4. Radioactive detection: Historically used radioactive probes and autoradiography film (less common today)

Applications
Western blotting has numerous applications in research and clinical settings [5, 3]:

1. Protein identification and characterization: Detecting specific proteins in complex mixtures
2. Post-translational modifications: Identifying modifications like phosphorylation, glycosylation, and ubiquitination
3. Protein expression analysis: Measuring relative protein levels in different samples
4. Clinical diagnostics: Confirming diseases like HIV through antibody detection
5. Drug development: Evaluating protein targets and drug effects
6. Epitope mapping: Identifying antibody binding sites on proteins
7. Anti-doping testing: Detecting prohibited substances in sports

Advantages
1. Requires only small amounts of reagents [5]
2. High specificity due to antibody-antigen interactions [4]
3. Provides information about protein molecular weight [1]
4. Same protein transfer can be used for multiple analyses [5]
5. Can detect proteins at picogram levels with optimized protocols

Limitations
1. Requires specific antibodies for target proteins [5]
2. Potential for antibody cross-reactivity and off-target effects [5]
3. Time-consuming procedure (typically 1-2 days) [1]
4. Semi-quantitative rather than fully quantitative [1]
5. Relatively costly due to antibody expenses and detection reagents [5]

📝 Word Embedding

machine-learning

A specific type of embedding that maps words to vectors of real numbers, capturing semantic relationships between words. Similar words cluster together in the embedding space, allowing models to represent word meanings and relationships. Examples include Word2Vec and GloVe.
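"Clustering together" is usually measured with cosine similarity between word vectors. The sketch below uses made-up three-dimensional vectors purely for illustration. Real Word2Vec or GloVe embeddings are learned from text and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction, 0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not taken from any trained model.
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

if __name__ == "__main__":
    print(cosine_similarity(embeddings["king"], embeddings["queen"]))
    print(cosine_similarity(embeddings["king"], embeddings["banana"]))
```

In this toy setup the related pair scores much closer to 1 than the unrelated pair, which is the behavior a trained embedding is expected to show for semantically similar words.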

🧬 Xenologs

biology

• Definition: Homologous genes acquired through horizontal gene transfer (HGT) between different species rather than through vertical inheritance.

• Key characteristics:
- Phylogenetic incongruence with species tree
- Often have nucleotide composition or codon usage distinct from host genome
- May be flanked by mobile genetic elements
- Frequently confer novel adaptive functions

• Common mechanisms of transfer:
- Transformation: Uptake of environmental DNA
- Conjugation: Direct cell-to-cell transfer
- Transduction: Virus-mediated transfer
- Endosymbiotic gene transfer: From organelles to nucleus

• Prevalence across life:
- Common in prokaryotes (bacteria and archaea)
- Less frequent but significant in unicellular eukaryotes
- Rare but documented in multicellular eukaryotes
- Extensive in certain lineages (e.g., bdelloid rotifers)

• Biological and evolutionary significance:
- Rapid acquisition of adaptive traits (e.g., antibiotic resistance)
- Metabolic innovation and niche expansion
- Acceleration of evolutionary change
- Complication of phylogenetic reconstruction

• Applications:
- Tracking antibiotic resistance spread
- Identifying potential bioremediation genes
- Understanding microbial genome evolution
- Developing novel antimicrobial strategies

🧬 Zinc Fingers

biology

• Definition: Protein structural motifs that coordinate one or more zinc ions to stabilize their fold and facilitate interactions with other molecules, particularly DNA, RNA, or proteins.

• Structure and classification:
- Classical C2H2 zinc fingers: Contain two cysteines and two histidines that coordinate a zinc ion
- C4 zinc fingers: Four cysteine residues coordinate the zinc ion (e.g., nuclear hormone receptors)
- C3H zinc fingers: Three cysteines and one histidine coordinate the zinc ion
- RING finger domains: Cross-brace arrangement of cysteines and histidines coordinating two zinc ions

• Biological functions:
- Transcription regulation: Binding to specific DNA sequences to control gene expression
- RNA binding: Recognizing specific RNA structures or sequences
- Protein-protein interactions: Mediating interactions between proteins in cellular processes
- Chromatin remodeling: Contributing to changes in chromatin structure and accessibility

• Applications in biotechnology:
- Zinc finger nucleases (ZFNs): Engineered proteins combining zinc finger domains with nuclease activity for genome editing
- Artificial transcription factors: Custom-designed zinc fingers fused to regulatory domains
- Protein engineering: Creating novel binding specificities for research and therapeutic applications
- Diagnostic tools: Zinc finger-based probes for detecting specific nucleic acid sequences

• Clinical relevance:
- Cancer biology: Mutations in zinc finger proteins associated with various cancers
- Developmental disorders: Defects in zinc finger proteins linked to congenital abnormalities
- Therapeutic targets: Potential for targeting disease-associated zinc finger proteins
- Gene therapy: ZFN-based approaches for correcting genetic mutations
- Circular dichroism: Measuring optical activity
- NMR spectroscopy: Distinguishing based on chemical environment
- Computational methods: Predicting and analyzing stereochemical properties

• Applications in bioinformatics:
- Molecular docking: Accounting for stereochemistry in protein-ligand interactions
- Structure-based drug design: Optimizing stereochemistry for target binding
- Conformational analysis: Predicting energetically favorable
- Cheminformatics: Representing and searching stereochemical information in databases

![Zinc finger protein structure showing the DNA-binding domain](/img/zinc_finger.png)