OpenClaw beta gateway becomes OpenAI-compatible
OpenClaw 2026.3.24-beta.1 adds /v1/models and /v1/embeddings, nudging its gateway toward a local control plane for evals, RAG, and OpenAI-shaped clients.
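In practice, OpenAI compatibility means an OpenAI-shaped client can simply point its base URL at the local gateway. A minimal sketch of what those two endpoints look like from the client side; the host, port, API key, and model name below are placeholders, not values from the release notes:

```python
import json
import urllib.request

# Hypothetical local gateway address; adjust to wherever the
# OpenClaw gateway actually listens in your deployment.
BASE_URL = "http://localhost:8080/v1"

def build_request(path, payload=None):
    """Build an OpenAI-shaped request against the local gateway."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer local-key",  # placeholder credential
    }
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(f"{BASE_URL}{path}", data=data, headers=headers)

# GET /v1/models lists the models the gateway exposes;
# POST /v1/embeddings returns vectors for the given inputs.
models_req = build_request("/models")
embed_req = build_request("/embeddings", {"model": "example-embedding-model",
                                          "input": ["hello world"]})

print(models_req.full_url)      # http://localhost:8080/v1/models
print(embed_req.get_method())   # POST (urllib infers POST when data is set)
```

Because the shapes match the OpenAI wire format, eval harnesses and RAG pipelines written against the OpenAI SDK should only need a base-URL swap to target the gateway.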

Signed archive
Lena tracks the economics and mechanics behind AI systems, from serving architecture and open-weight deployment to developer tooling, platform shifts, product decisions, and the operational tradeoffs that shape what teams actually run. Her reporting is aimed at builders and operators deciding what to trust, adopt, and maintain.
Latest story
A class-action suit alleges Perplexity piped millions of AI chat transcripts to Meta and Google through hidden trackers. Incognito Mode did nothing to stop it.
Coverage signature
If the cost curve moves, the product strategy moves with it.
Technical, commercial, and grounded in constraints.
Published stories
Cisco's DefenseClaw arrives just after NVIDIA's NemoClaw and a run of real OpenClaw attacks, turning agent security from a side note into a market forming around the platform.
NVIDIA Dynamo matters because it sits above vLLM, SGLang, and TensorRT-LLM to coordinate routing, KV reuse, disaggregated serving, and scaling across GPU fleets.
vLLM 0.18.0 signals a split multimodal serving stack, with render, transport, and GPU inference starting to separate into cleaner infrastructure tiers.
OpenShell matters less as another framework than as a control plane that moves policy, sandboxing, and model routing outside the agent’s reach.
vLLM's Triton and ROCm attention work points to a new inference contest: portable backends that can make AMD and other non-NVIDIA stacks credible in production.
The Linux Foundation’s $12.5 million coalition shows AI labs now need open source maintainers to handle a rising flood of AI-generated security findings.
FlashAttention-4 shows Blackwell-era AI economics will be shaped by attention kernel optimization and non-tensor bottlenecks, not FLOPs headlines alone.
Meta's MTIA roadmap and its 6GW AMD pact point to the same goal: cheaper inference, more control, and less life spent waiting on one supplier's clock.
NVIDIA's AI grid pitch is a bet that telecom networks can sell distributed inference, but only if operators package it like a product and not a committee.
I only get excited about open-weight inference when utilization, latency, privacy, and ops discipline line up. Sticker price alone is the decoy menu.