
Alex Ziskind

18 videos tracked since Mar 2026

Alex presents his content with a blend of technical detail and accessible explanations, making complex topics understandable to a wide audience.

YouTube Channel
AI Hardware Performance Developer Hardware Mini PC

About This Creator

Alex Ziskind is a tech enthusiast and YouTuber who specializes in AI hardware performance, developer hardware, and mini PCs. He delves into topics like local LLMs, Apple Silicon, and benchmarking GPU efficiency, offering insights that cater to both tech enthusiasts and professionals. His style is engaging and informative, often presenting complex topics in an accessible manner.

Messaging Evolution

How this creator's focus and perspective have shifted over time

2026-01

General hardware reviews and benchmarks

2026-02

Apple Silicon and local LLMs

2026-03

Advanced AI performance and benchmarking

Over time, Alex's content has evolved from a focus on general hardware reviews to more specialized topics like local LLMs and GPU efficiency. His recent videos highlight his deep dive into Apple Silicon, benchmarking, and innovative solutions for AI performance.

Video Timeline

Every video we've tracked from this creator, newest first

March 2026

TUTORIAL intermediate

NVIDIA didn't want me to do this

This video discusses the successful setup of a four-node NVIDIA cluster, highlighting both operational success and potential vendor restrictions.

"I've got a cluster of four of these working together."
NVIDIA Hardware AI Cluster Setup System Configuration
Key Takeaways
  • Multi-node clustering is achievable outside standard configurations
  • There are both advantages and disadvantages to this specific setup
  • Vendor policies may discourage certain hardware combinations
NEWS

I Ran Claude Code for FREE… Here's How

Claude Code, a very popular tool, can now use local models. In other words, I can be running an LLM right on my laptop, and Claude Code ...

FRESH_TAKE intermediate

Skip M3 Ultra & RTX 5090 for LLMs | NEW 96GB KING

This video compares high-end graphics cards for running Large Language Models, arguing that the new RTX Pro 6000 with 96GB VRAM is a superior choice over the M3 Ultra and RTX 5090.

"This is the brand new RTX Pro 6000, and it's heavier than it looks."
LLM Hardware GPU Comparison RTX Pro 6000
Key Takeaways
  • The RTX Pro 6000 offers a significant VRAM advantage for local LLM inference
  • Consumer flagship GPUs may not be the best value for memory-intensive AI tasks
  • Professional workstation cards provide better capacity for large model deployment
NEW_RELEASE beginner

Nvidia, You’re Late. World’s First 128GB LLM Mini Is Here!

This video reviews GMK Tech's new EVO X2 Mini PC, highlighting its groundbreaking 128GB RAM capacity designed for running local large language models.

"DGX Spark will be ready, will be available shortly, probably in a few weeks."
Mini PC Local LLM Hardware Review GMK Tech
Key Takeaways
  • 128GB RAM enables local AI model execution on compact hardware
  • GMK Tech challenges Nvidia's dominance in AI hardware
  • Product availability expected within a few weeks
TUTORIAL intermediate

Your Local LLM Is 3x Slower Than It Should Be

This video tutorial demonstrates how to significantly improve local LLM inference speed using a draft model approach, specifically targeting performance bottlenecks.

"Alright, you're gonna like this. Watch this."
Local LLM Optimization Speculative Decoding Draft Model Techniques
Key Takeaways
  • Local LLMs can be optimized to run 3x faster than default settings
  • Draft model approaches reduce inference latency significantly
  • Quantization plays a role in balancing speed and accuracy
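The draft-model idea behind this speedup can be sketched as a toy loop: a cheap draft model proposes a short run of tokens, and the expensive target model verifies the whole run in one pass instead of generating token by token. This is a minimal illustration of the technique, not the video's actual code, and both "models" here are stand-in functions.

```python
# Toy sketch of speculative decoding with a draft model.
# Both models are stand-ins: the draft proposes tokens cheaply, the
# target verifies a whole proposed run in a single (counted) pass.

def draft_propose(prefix, k):
    # Stand-in for a fast draft model: guesses the next k tokens.
    return [t % 7 for t in range(len(prefix), len(prefix) + k)]

def target_next_token(prefix):
    # Stand-in for the slow target model's true next token.
    return len(prefix) % 7

def speculative_decode(n_tokens, k=4):
    out = []
    target_calls = 0
    while len(out) < n_tokens:
        proposal = draft_propose(out, k)
        # One target pass verifies all k proposals at once.
        target_calls += 1
        accepted, ctx = [], list(out)
        for tok in proposal:
            truth = target_next_token(ctx)
            if tok != truth:
                # Real drafts sometimes miss; keep the target's
                # correction and stop accepting this run.
                accepted.append(truth)
                break
            accepted.append(tok)
            ctx.append(tok)
        out.extend(accepted)
    return out[:n_tokens], target_calls

tokens, calls = speculative_decode(20)
print(calls)  # 5 verify passes instead of 20 sequential target calls
```

Because the toy draft always agrees with the toy target, every run of 4 is accepted; in practice acceptance is partial, and the speedup depends on how often the draft guesses right.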
NEWS

John Carmack Was Right. The Internet Was Wrong.

The DGX Spark isn't the only player in town when it comes to having one petaflop of AI supercomputer on your desk. There's a bunch of t...

NEW_RELEASE intermediate

I Ran a Trillion Parameter AI on a Mac... Here’s the Secret

This video demonstrates the feasibility of running a massive 1 trillion parameter AI model locally on a Mac computer, highlighting recent advancements in model efficiency.

"KK 2.5 is out and it's the new big hot model."
Artificial Intelligence Mac Hardware Model Optimization
Key Takeaways
  • Consumer hardware can support massive AI models
  • KK 2.5 is a significant new release
  • Local inference is becoming more accessible
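To see why a trillion-parameter model on a Mac is remarkable, a back-of-envelope memory estimate helps. The bit-widths below are generic assumptions, not figures from the video:

```python
# Rough weight-memory estimate for a 1-trillion-parameter model.
# Ignores KV cache and runtime overhead; decimal GB.

PARAMS = 1_000_000_000_000  # 1T parameters

def weights_gb(params, bits_per_weight):
    return params * bits_per_weight / 8 / 1e9

fp16_gb = weights_gb(PARAMS, 16)  # 2000 GB at 16-bit
q4_gb = weights_gb(PARAMS, 4)     # 500 GB even at 4-bit

print(f"fp16: {fp16_gb:.0f} GB, 4-bit: {q4_gb:.0f} GB")
```

Even aggressively quantized, the weights alone dwarf a single 128 GB machine, which is why tricks like sparse/mixture-of-experts architectures and multi-machine setups keep coming up in this space.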
NEWS

Stop Paying for AI Video... Download This Instead (low VRAM)

You might have heard that open video is here and it's the first time an actual open-weights release shows up with the whole stack. The mo...

TUTORIAL intermediate

Private AI Framework Cluster… FIXED

This video documents the process of upgrading and resolving issues within a self-hosted private AI framework cluster to enhance performance.

"I made some upgrades to my cluster."
Private AI Infrastructure Cluster Management System Optimization
Key Takeaways
  • Clustering capable machines enables scalable resource utilization for AI workloads
  • Regular upgrades are necessary to maintain stability in private AI frameworks
  • Self-hosting ensures data privacy while leveraging distributed computing power
TUTORIAL intermediate

Your local LLM is 10x slower than it should be

This video explores why local LLM inference often underperforms and provides actionable steps to optimize speed by adjusting backend configurations and hardware utilization.

"You're probably familiar with Ollama, right? This is what it looks like, you can launch it, you can talk to it."
Local LLMs Performance Optimization Inference Speed
Key Takeaways
  • Default settings often limit hardware potential
  • Backend choice significantly impacts throughput
  • Quantization can balance speed and quality
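One common version of the backend tuning this video points at is moving from default settings to an explicitly configured llama.cpp server. The commands below are an illustrative sketch, not the video's exact steps: the model file and thread count are placeholders, and flag names can differ between llama.cpp releases.

```shell
# Default-ish launch: whatever the build decides for GPU offload,
# threads, and context.
llama-server -m model-q4_k_m.gguf

# Tuned launch: offload all layers to the GPU (-ngl), pin threads to
# physical cores (-t), and enable flash attention (-fa) if supported.
llama-server -m model-q4_k_m.gguf -ngl 99 -t 8 -fa
```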
NEW_RELEASE intermediate

Dev Workloads and LLMs… under $1000

This video reviews the GeekOM A9 Max mini PC, evaluating its performance for developer workloads and local LLMs while highlighting its sub-$1000 price point.

"The AMD Strix Point chip is finally in a mini PC that's under $1000 in this GeekOM A9 Max that many of you have been asking about."
Mini PC AMD Strix Point Local LLMs Developer Hardware
Key Takeaways
  • AMD Strix Point chips are now accessible in affordable mini PCs
  • The GeekOM A9 Max offers viable performance for dev workloads under $1000
  • Local LLM inference is becoming more feasible on budget hardware
FRESH_TAKE intermediate

Wait, Spark IS faster!

This video challenges the common perception that NVIDIA DGX Spark systems are slow when running AI models, arguing that standard benchmarks may not reflect real-world performance.

"What if I told you that what you've seen so far about the DGX Spark being slower is wrong?"
AI Hardware Performance DGX Spark Benchmarks GPU Efficiency
Key Takeaways
  • Single-user benchmarks may misrepresent actual AI workload speed
  • DGX Spark outperforms expectations in specific contexts
  • Perceived slowness is often a benchmarking artifact
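The benchmarking point can be made with simple arithmetic: per-stream speed typically drops as concurrent requests rise, while total throughput climbs. The numbers below are invented placeholders, not measurements from the video:

```python
# Toy numbers: per-stream tokens/sec at different concurrency levels.
# A one-user benchmark sees only the first row; a batched-serving box
# is judged by the aggregate column.
measurements = {1: 30.0, 2: 26.0, 4: 19.0, 8: 12.0}  # streams -> tok/s each

for streams, per_stream in measurements.items():
    print(f"{streams} streams: {per_stream * streams:.0f} tok/s aggregate")
```

With these made-up figures, per-stream speed falls 30 → 12 tok/s while aggregate throughput rises 30 → 96 tok/s, which is the gap a single-user benchmark never shows.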
NEWS

The Cheap Model Got Smoked… Until I Checked the Output

I have a course site that I've had for many years and I need to migrate away from my current provider. Unfortunately, because their pri...

OPINION intermediate

Dev laptop shopping…Safe bet, then and now

This video reviews developer laptop options, comparing historical safe bets with current hardware suitable for local AI inference versus traditional coding workflows.

"I feel like there's two types of developers right now."
Laptop Shopping Developer Hardware Local AI Inference
Key Takeaways
  • Developers now split between local AI needs and traditional coding
  • Hardware selection depends on specific workflow requirements
  • Proven configurations often remain the safest choice
FRESH_TAKE intermediate

Fastest 1,000,000 tokens… and who paid the most

This video benchmarks different machines to determine which can generate 1 million tokens the fastest, comparing performance and costs.

"I wanted to answer one simple question. Which of these machines can generate 1 million tokens the fastest?"
LLM Benchmarking Token Generation Speed AI Hardware Performance
Key Takeaways
  • Hardware choice significantly impacts token generation speed
  • Cost efficiency is a critical factor alongside raw speed
  • Large-scale testing reveals real-world performance differences
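The cost side of this benchmark reduces to straightforward math: time to 1M tokens from throughput, then energy cost from wattage. The throughput, wattage, and electricity-price figures below are invented placeholders, not the video's results:

```python
# Time and electricity cost to generate 1M tokens, from assumed
# throughput (tok/s) and power draw (W). All numbers are placeholders.
machines = {
    "machine_a": {"tok_per_s": 50.0, "watts": 300.0},
    "machine_b": {"tok_per_s": 20.0, "watts": 60.0},
}

PRICE_PER_KWH = 0.30  # assumed electricity price, USD

def million_token_cost(tok_per_s, watts):
    hours = 1_000_000 / tok_per_s / 3600
    kwh = watts * hours / 1000
    return hours, kwh * PRICE_PER_KWH

for name, m in machines.items():
    hours, cost = million_token_cost(m["tok_per_s"], m["watts"])
    print(f"{name}: {hours:.1f} h, ${cost:.2f} in electricity")
```

Note the tension the video's title hints at: with these placeholder figures, the faster machine finishes in under half the time yet costs twice as much in electricity, so "fastest" and "cheapest" need not be the same box.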
NEW_RELEASE intermediate

Apple JUST Dropped a Game-Changer

Alex Ziskind discusses a new Apple update that potentially solves historical issues with Mac cluster performance and longevity.

"Every Mac cluster I've built in the past has had the same painful ending."
Apple Silicon Mac Cluster Hardware Updates
Key Takeaways
  • Historical Mac clusters often faced painful endings
  • New updates aim to improve cluster stability and performance
  • Significant changes in Apple computing technology are emerging
TUTORIAL

Your Mac Has Hidden VRAM… Here's How to Unlock It

If your Apple Silicon machine, like a MacBook or a little Mac mini, doesn't have much RAM and you still want to run decently sized large language models, you can dow...
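A likely candidate for the trick this video describes: macOS caps how much unified memory the GPU may wire, and on Apple Silicon that cap can be raised with a sysctl (it resets on reboot). This is a hedged sketch, not confirmed to be the video's method; the key name and defaults vary by macOS version, so verify on your machine first.

```shell
# Inspect the current GPU wired-memory limit in MB
# (0 means "use the built-in default").
sysctl iogpu.wired_limit_mb

# Example: allow the GPU to wire up to 24 GB on a 32 GB machine.
# Leave headroom for macOS itself; the value resets on reboot.
sudo sysctl iogpu.wired_limit_mb=24576
```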
