
Alex Ziskind

18 videos tracked since Mar 2026

Alex presents his content with a blend of technical detail and accessible explanations, making complex topics understandable to a wide audience.

YouTube Channel
AI Hardware Performance Developer Hardware Mini PC

About This Creator

Alex Ziskind is a tech enthusiast and YouTuber who specializes in AI hardware performance, developer hardware, and mini PCs. He delves into topics like local LLMs, Apple Silicon, and benchmarking GPU efficiency, offering insights that cater to both tech enthusiasts and professionals. His style is engaging and informative, often presenting complex topics in an accessible manner.

Messaging Evolution

How this creator's focus and perspective have shifted over time

2026-01

General hardware reviews and benchmarks

2026-02

Apple Silicon and local LLMs

2026-03

Advanced AI performance and benchmarking

Over time, Alex's content has evolved from a focus on general hardware reviews to more specialized topics like local LLMs and GPU efficiency. His recent videos highlight his deep dive into Apple Silicon, benchmarking, and innovative solutions for AI performance.

Video Timeline

Every video we've tracked from this creator, newest first

March 2026

TUTORIAL intermediate

NVIDIA didn't want me to do this

This video discusses the successful setup of a four-node NVIDIA cluster, highlighting both operational success and potential vendor restrictions.

"I've got a cluster of four of these working together."
NVIDIA Hardware AI Cluster Setup System Configuration
Key Takeaways
  • Multi-node clustering is achievable outside standard configurations
  • There are both advantages and disadvantages to this specific setup
  • Vendor policies may discourage certain hardware combinations
NEWS

I Ran Claude Code for FREE… Here's How

Claude Code, a very popular tool, can now use local models. In other words, I can be running an LLM right on my laptop, and Claude Code ...

FRESH_TAKE intermediate

Skip M3 Ultra & RTX 5090 for LLMs | NEW 96GB KING

This video compares high-end graphics cards for running Large Language Models, arguing that the new RTX Pro 6000 with 96GB VRAM is a superior choice over the M3 Ultra and RTX 5090.

"This is the brand new RTX Pro 6000, and it's heavier than it looks."
LLM Hardware GPU Comparison RTX Pro 6000
Key Takeaways
  • The RTX Pro 6000 offers a significant VRAM advantage for local LLM inference
  • Consumer flagship GPUs may not be the best value for memory-intensive AI tasks
  • Professional workstation cards provide better capacity for large model deployment
NEW_RELEASE beginner

Nvidia, You’re Late. World’s First 128GB LLM Mini Is Here!

This video reviews GMK Tech's new EVO X2 Mini PC, highlighting its groundbreaking 128GB RAM capacity designed for running local large language models.

"DGX Spark will be ready, will be available shortly, probably in a few weeks."
Mini PC Local LLM Hardware Review GMK Tech
Key Takeaways
  • 128GB RAM enables local AI model execution on compact hardware
  • GMK Tech challenges Nvidia's dominance in AI hardware
  • Product availability expected within a few weeks
TUTORIAL intermediate

Your Local LLM Is 3x Slower Than It Should Be

This video tutorial demonstrates how to significantly improve local LLM inference speed using a draft model approach, specifically targeting performance bottlenecks.

"Alright, you're gonna like this. Watch this."
Local LLM Optimization Speculative Decoding Draft Model Techniques
Key Takeaways
  • Local LLMs can be optimized to run 3x faster than default settings
  • Draft model approaches reduce inference latency significantly
  • Quantization plays a role in balancing speed and accuracy
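The draft-model idea behind this speedup can be sketched as a toy loop: a cheap draft model proposes a short run of tokens, and the expensive target model verifies the whole run in one pass instead of generating token by token. This is a minimal illustration of the technique, not the video's actual code, and both "models" here are stand-in functions.

```python
# Toy sketch of speculative decoding with a draft model.
# Both models are stand-ins: the draft proposes tokens cheaply, the
# target verifies a whole proposed run in a single (counted) pass.

def draft_propose(prefix, k):
    # Stand-in for a fast draft model: guesses the next k tokens.
    return [t % 7 for t in range(len(prefix), len(prefix) + k)]

def target_next_token(prefix):
    # Stand-in for the slow target model's true next token.
    return len(prefix) % 7

def speculative_decode(n_tokens, k=4):
    out = []
    target_calls = 0
    while len(out) < n_tokens:
        proposal = draft_propose(out, k)
        # One target pass verifies all k proposals at once.
        target_calls += 1
        accepted, ctx = [], list(out)
        for tok in proposal:
            truth = target_next_token(ctx)
            if tok != truth:
                # Real drafts sometimes miss; keep the target's
                # correction and stop accepting this run.
                accepted.append(truth)
                break
            accepted.append(tok)
            ctx.append(tok)
        out.extend(accepted)
    return out[:n_tokens], target_calls

tokens, calls = speculative_decode(20)
print(calls)  # 5 verify passes instead of 20 sequential target calls
```

Because the toy draft always agrees with the toy target, every run of 4 is accepted; in practice acceptance is partial, and the speedup depends on how often the draft guesses right.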
NEWS

John Carmack Was Right. The Internet Was Wrong.

The DGX Spark isn't the only player in town when it comes to having one petaflop of AI supercomputer on your desk. There's a bunch of t...

NEW_RELEASE intermediate

I Ran a Trillion Parameter AI on a Mac... Here’s the Secret

This video demonstrates the feasibility of running a massive 1 trillion parameter AI model locally on a Mac computer, highlighting recent advancements in model efficiency.

"KK 2.5 is out and it's the new big hot model."
Artificial Intelligence Mac Hardware Model Optimization
Key Takeaways
  • Consumer hardware can support massive AI models
  • KK 2.5 is a significant new release
  • Local inference is becoming more accessible
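To see why a trillion-parameter model on a Mac is remarkable, a back-of-envelope memory estimate helps. The bit-widths below are generic assumptions, not figures from the video:

```python
# Rough weight-memory estimate for a 1-trillion-parameter model.
# Ignores KV cache and runtime overhead; decimal GB.

PARAMS = 1_000_000_000_000  # 1T parameters

def weights_gb(params, bits_per_weight):
    return params * bits_per_weight / 8 / 1e9

fp16_gb = weights_gb(PARAMS, 16)  # 2000 GB at 16-bit
q4_gb = weights_gb(PARAMS, 4)     # 500 GB even at 4-bit

print(f"fp16: {fp16_gb:.0f} GB, 4-bit: {q4_gb:.0f} GB")
```

Even aggressively quantized, the weights alone dwarf a single 128 GB machine, which is why tricks like sparse/mixture-of-experts architectures and multi-machine setups keep coming up in this space.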
NEWS

Stop Paying for AI Video... Download This Instead (low VRAM)

You might have heard that open video is here and it's the first time an actual open-weights release shows up with the whole stack. The mo...

TUTORIAL intermediate

Private AI Framework Cluster… FIXED

This video documents the process of upgrading and resolving issues within a self-hosted private AI framework cluster to enhance performance.

"I made some upgrades to my cluster."
Private AI Infrastructure Cluster Management System Optimization
Key Takeaways
  • Clustering capable machines enables scalable resource utilization for AI workloads
  • Regular upgrades are necessary to maintain stability in private AI frameworks
  • Self-hosting ensures data privacy while leveraging distributed computing power
TUTORIAL intermediate

Your local LLM is 10x slower than it should be

This video explores why local LLM inference often underperforms and provides actionable steps to optimize speed by adjusting backend configurations and hardware utilization.

"You're probably familiar with Ollama, right? This is what it looks like, you can launch it, you can talk to it."
Local LLMs Performance Optimization Inference Speed
Key Takeaways
  • Default settings often limit hardware potential
  • Backend choice significantly impacts throughput
  • Quantization can balance speed and quality
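One common version of the backend tuning this video points at is moving from default settings to an explicitly configured llama.cpp server. The commands below are an illustrative sketch, not the video's exact steps: the model file and thread count are placeholders, and flag names can differ between llama.cpp releases.

```shell
# Default-ish launch: whatever the build decides for GPU offload,
# threads, and context.
llama-server -m model-q4_k_m.gguf

# Tuned launch: offload all layers to the GPU (-ngl), pin threads to
# physical cores (-t), and enable flash attention (-fa) if supported.
llama-server -m model-q4_k_m.gguf -ngl 99 -t 8 -fa
```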
NEW_RELEASE intermediate

Dev Workloads and LLMs… under $1000

This video reviews the GeekOM A9 Max mini PC, evaluating its performance for developer workloads and local LLMs while highlighting its sub-$1000 price point.

"The AMD Strix Point chip is finally in a mini PC that's under $1000 in this GeekOM A9 Max that many of you have been asking about."
Mini PC AMD Strix Point Local LLMs Developer Hardware
Key Takeaways
  • AMD Strix Point chips are now accessible in affordable mini PCs
  • The GeekOM A9 Max offers viable performance for dev workloads under $1000
  • Local LLM inference is becoming more feasible on budget hardware
FRESH_TAKE intermediate

Wait, Spark IS faster!

This video challenges the common perception that NVIDIA DGX Spark systems are slow when running AI models, arguing that standard benchmarks may not reflect real-world performance.

"What if I told you that what you've seen so far about the DGX Spark being slower is wrong?"
AI Hardware Performance DGX Spark Benchmarks GPU Efficiency
Key Takeaways
  • Single-user benchmarks may misrepresent actual AI workload speed
  • DGX Spark outperforms expectations in specific contexts
  • Perceived slowness is often a benchmarking artifact
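The benchmarking point can be made with simple arithmetic: per-stream speed typically drops as concurrent requests rise, while total throughput climbs. The numbers below are invented placeholders, not measurements from the video:

```python
# Toy numbers: per-stream tokens/sec at different concurrency levels.
# A one-user benchmark sees only the first row; a batched-serving box
# is judged by the aggregate column.
measurements = {1: 30.0, 2: 26.0, 4: 19.0, 8: 12.0}  # streams -> tok/s each

for streams, per_stream in measurements.items():
    print(f"{streams} streams: {per_stream * streams:.0f} tok/s aggregate")
```

With these made-up figures, per-stream speed falls 30 → 12 tok/s while aggregate throughput rises 30 → 96 tok/s, which is the gap a single-user benchmark never shows.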
NEWS

The Cheap Model Got Smoked… Until I Checked the Output

I have a course site that I've had for many years and I need to migrate away from my current provider. Unfortunately, because their pri...

OPINION intermediate

Dev laptop shopping…Safe bet, then and now

This video reviews developer laptop options, comparing historical safe bets with current hardware suitable for local AI inference versus traditional coding workflows.

"I feel like there's two types of developers right now."
Laptop Shopping Developer Hardware Local AI Inference
Key Takeaways
  • Developers now split between local AI needs and traditional coding
  • Hardware selection depends on specific workflow requirements
  • Proven configurations often remain the safest choice
FRESH_TAKE intermediate

Fastest 1,000,000 tokens… and who paid the most

This video benchmarks different machines to determine which can generate 1 million tokens the fastest, comparing performance and costs.

"I wanted to answer one simple question. Which of these machines can generate 1 million tokens the fastest?"
LLM Benchmarking Token Generation Speed AI Hardware Performance
Key Takeaways
  • Hardware choice significantly impacts token generation speed
  • Cost efficiency is a critical factor alongside raw speed
  • Large-scale testing reveals real-world performance differences
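The cost side of this benchmark reduces to straightforward math: time to 1M tokens from throughput, then energy cost from wattage. The throughput, wattage, and electricity-price figures below are invented placeholders, not the video's results:

```python
# Time and electricity cost to generate 1M tokens, from assumed
# throughput (tok/s) and power draw (W). All numbers are placeholders.
machines = {
    "machine_a": {"tok_per_s": 50.0, "watts": 300.0},
    "machine_b": {"tok_per_s": 20.0, "watts": 60.0},
}

PRICE_PER_KWH = 0.30  # assumed electricity price, USD

def million_token_cost(tok_per_s, watts):
    hours = 1_000_000 / tok_per_s / 3600
    kwh = watts * hours / 1000
    return hours, kwh * PRICE_PER_KWH

for name, m in machines.items():
    hours, cost = million_token_cost(m["tok_per_s"], m["watts"])
    print(f"{name}: {hours:.1f} h, ${cost:.2f} in electricity")
```

Note the tension the video's title hints at: with these placeholder figures, the faster machine finishes in under half the time yet costs twice as much in electricity, so "fastest" and "cheapest" need not be the same box.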
NEW_RELEASE intermediate

Apple JUST Dropped a Game-Changer

Alex Ziskind discusses a new Apple update that potentially solves historical issues with Mac cluster performance and longevity.

"Every Mac cluster I've built in the past has had the same painful ending."
Apple Silicon Mac Cluster Hardware Updates
Key Takeaways
  • Historical Mac clusters often faced painful endings
  • New updates aim to improve cluster stability and performance
  • Significant changes in Apple computing technology are emerging
TUTORIAL

Your Mac Has Hidden VRAM… Here's How to Unlock It

If your Apple Silicon machine, like a MacBook or a little Mac mini, doesn't have much RAM and you still want to run decently sized large language models, you can dow...
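A likely candidate for the trick this video describes: macOS caps how much unified memory the GPU may wire, and on Apple Silicon that cap can be raised with a sysctl (it resets on reboot). This is a hedged sketch, not confirmed to be the video's method; the key name and defaults vary by macOS version, so verify on your machine first.

```shell
# Inspect the current GPU wired-memory limit in MB
# (0 means "use the built-in default").
sysctl iogpu.wired_limit_mb

# Example: allow the GPU to wire up to 24 GB on a 32 GB machine.
# Leave headroom for macOS itself; the value resets on reboot.
sudo sysctl iogpu.wired_limit_mb=24576
```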
