Karpathy Flash News List | Blockchain.News

List of Flash News about Karpathy

2025-12-10
17:25
Andrej Karpathy Announces nanoGPT as First LLM to Train and Run Inference in Space — What Traders Should Know

According to @karpathy, nanoGPT is the first large language model to both train and run inference in space, marking the start of that effort. The announcement confirms the initiative has begun but discloses no technical specifications, mission details, partners, or timeline. The post references no cryptocurrencies, tokens, or market integrations, which limits immediate data-driven trading conclusions and frames this as a sentiment-driven headline for AI and compute narratives.

Source
2025-12-10
17:15
Andrej Karpathy Benchmarks GPT-5.1 Thinking API on 930 Hacker News Threads: 3 Hours Build, 1 Hour Run, $60 Cost

According to @karpathy, he used the GPT-5.1 Thinking API to auto-grade all 930 December 2015 Hacker News frontpage article-discussion pairs to identify the most and least prescient comments, taking about 3 hours to write the code and roughly 1 hour and $60 to run, source: twitter.com/karpathy/status/1998803709468487877 and karpathy.bearblog.dev/auto-grade-hn. According to @karpathy, the project repository is available at github.com/karpathy/hn-time-capsule and the full results are browsable at karpathy.ai/hncapsule, source: twitter.com/karpathy/status/1998803709468487877. According to @karpathy, he emphasized in-hindsight analysis as a practical way to train forward prediction models and noted that future LLMs will perform such work cheaper, faster, and better, source: twitter.com/karpathy/status/1998803709468487877. According to @karpathy, the top 10 most prescient HN accounts for that month were pcwalton, tptacek, paulmd, cstross, greglindahl, moxie, hannob, 0xcde4c3db, Manishearth, and johncolanduoni, source: twitter.com/karpathy/status/1998803709468487877. According to @karpathy, these run-time and cost figures provide a concrete real-world datapoint for large-scale LLM evaluation workflows using GPT-5.1 Thinking, anchored at approximately $60 for a 930-thread pass in about one hour, which traders tracking AI infrastructure efficiency can use as a benchmark, source: twitter.com/karpathy/status/1998803709468487877 and karpathy.bearblog.dev/auto-grade-hn.
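The grading loop described above can be sketched in outline. This is a hypothetical reconstruction, not Karpathy's code: the real prompts, rubric, and GPT-5.1 Thinking API calls are not disclosed in the source, so `grade_comment` below is a stand-in stub.

```python
# Hypothetical sketch of an in-hindsight comment-grading loop. In the real
# project an LLM scores each 2015 comment against what actually happened;
# here `grade_comment` is a toy stub, not Karpathy's implementation.

def grade_comment(article_outcome: str, comment: str) -> int:
    """Stand-in for an LLM call that scores prescience 1-10 in hindsight."""
    # A real version would send the comment plus the known outcome to an
    # LLM and parse a numeric score from its reply.
    overlap = sum(w in comment.lower() for w in article_outcome.lower().split())
    return min(10, 1 + overlap)

def rank_thread(outcome: str, comments: list[str]) -> list[tuple[int, str]]:
    """Return (score, comment) pairs sorted from most to least prescient."""
    scored = [(grade_comment(outcome, c), c) for c in comments]
    return sorted(scored, key=lambda t: -t[0])

ranked = rank_thread(
    "deep learning dominated the decade",
    ["deep learning will dominate", "this fad will pass"],
)
```

Aggregating such per-thread rankings by author is what would yield a "most prescient accounts" leaderboard like the one reported.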

Source
2025-12-09
03:40
Python random.seed Sign Bug: seed(5) equals seed(-5) — Critical Risk for AI and Crypto Trading Backtests

According to @karpathy, CPython’s random.seed ignores the sign of integer seeds, so seed(3) and seed(-3) produce identical RNG streams because the implementation takes the absolute value of PyLong arguments (source: twitter.com/karpathy/status/1998236299862659485; source: github.com/python/cpython/blob/main/Modules/_randommodule.c#L321). The Python docs state that if a is an int, it is used directly, and that the core generator is MT19937, but they only guarantee same seed => same sequence and do not promise distinct sequences for different seeds (source: docs.python.org/3/library/random.html). Karpathy reports this caused train=test leakage in his nanochat setup when he used seed sign to separate train/test splits, creating a serious reproducibility and overfitting risk (source: twitter.com/karpathy/status/1998236299862659485). For trading systems and crypto quants using Python for strategy simulation, Monte Carlo VaR, order routing randomness, or ML model evaluation, audit any pipelines that rely on sign-differentiated seeds or assume seed(n) != seed(-n) to avoid biased backtests and invalid performance metrics (source: twitter.com/karpathy/status/1998236299862659485). Actionable mitigations include avoiding negative-vs-positive seed conventions, using string or bytes seeds that are hashed via SHA-512 under version 2 seeding, or explicitly encoding the sign bit as 2*abs(n)+int(n<0) as noted by Karpathy (source: docs.python.org/3/library/random.html; source: twitter.com/karpathy/status/1998236299862659485).
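The sign-blindness and the mitigation mentioned in the post can both be demonstrated in a few lines; the behavior below reflects CPython's integer seeding taking the absolute value, and `sign_safe_seed` is simply the 2*abs(n)+int(n<0) encoding Karpathy notes.

```python
import random

# The sign-blindness described above, reproduced directly: CPython takes
# the absolute value of integer seeds, so +n and -n yield identical streams.
a = random.Random(5)
b = random.Random(-5)
assert [a.random() for _ in range(3)] == [b.random() for _ in range(3)]

# Mitigation noted in the post: fold the sign into a distinct non-negative
# integer with 2*abs(n) + int(n < 0), so +n and -n map to different seeds.
def sign_safe_seed(n: int) -> int:
    return 2 * abs(n) + int(n < 0)

c = random.Random(sign_safe_seed(5))   # seeds with 10
d = random.Random(sign_safe_seed(-5))  # seeds with 11
assert c.random() != d.random()
```

Any pipeline that used seed sign to separate train/test splits would silently draw both splits from the same stream, which is exactly the leakage described.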

Source
2025-12-07
18:13
Andrej Karpathy says LLMs are simulators, not agents for crypto trading research on BTC and ETH

According to Andrej Karpathy, large language models should be treated as simulators that channel multiple perspectives rather than as entities with their own opinions, source: Andrej Karpathy on X. He advises replacing you-centric questions with prompts that ask what different groups would say, which is directly applicable to structuring crypto market research, source: Andrej Karpathy on X. Applying this to trading workflows, practitioners can prompt simulated bulls, bears, and market makers to generate scenario narratives for BTC and ETH without assuming the model holds a personal view, source: Andrej Karpathy on X. He adds that forcing a you-voice only makes the model adopt a personality implied by finetuning data statistics, reinforcing role-based simulation as the correct mental model for AI-assisted analysis, source: Andrej Karpathy on X.

Source
2025-11-24
17:35
Andrej Karpathy’s Definitive View: AI Homework Detection Is Impossible — What Traders Should Know Now

According to @karpathy, AI use in homework cannot be detected and current AI detectors do not work, underscoring the inevitable adoption of generative AI in schools (Source: @karpathy on X, Nov 24, 2025). He briefed a school board and shared highlights urging schools to adapt to AI in education rather than rely on detection tools (Source: @karpathy on X, Nov 24, 2025). The post contains no references to cryptocurrencies or trading, indicating no stated direct crypto market impact (Source: @karpathy on X, Nov 24, 2025).

Source
2025-11-23
18:03
Andrej Karpathy Demo: Gemini Nano Banana Pro Solves Exam Image Questions in Real-World Test; Traders Watch GOOGL and AI Tokens RNDR, FET

According to @karpathy, Gemini Nano Banana Pro solved chemistry exam questions directly from an image of the exam page, correctly parsing doodles and diagrams, with ChatGPT later judging the answers correct except for a nomenclature fix on Se2P2 and a spelling correction for thiocyanic acid, source: Andrej Karpathy on X, Nov 23, 2025. The demo evidences in-image multimodal parsing and reasoning on dense document layouts, which aligns with Google’s Gemini family positioning and the inclusion of Nano in the product lineup, source: Andrej Karpathy on X, Nov 23, 2025; Google DeepMind Gemini introduction, Dec 2023. Historically, prominent AI capability reveals have coincided with rotations into AI-linked crypto assets such as RNDR and FET and related equities after major AI news, source: Reuters reporting on AI token rallies during the ChatGPT surge in Feb 2023 and after Nvidia earnings in May 2024. Traders may watch Alphabet GOOGL and AI infrastructure tokens for narrative momentum if this demo draws broader attention, while noting the accuracy risk highlighted by the Se2P2 naming and spelling errors, source: Andrej Karpathy on X, Nov 23, 2025; Reuters Feb 2023 and May 2024.

Source
2025-11-22
23:54
Andrej Karpathy unveils llm-council open-source multi-LLM ensemble via OpenRouter; GPT-5.1 ranked highest by peers, Claude lowest

According to @karpathy, he released an open-source llm-council web app that dispatches each user query to multiple models via OpenRouter, lets models review and rank anonymized responses, and then a Chairman LLM produces the final answer, detailing a concrete multi-LLM ensemble workflow. Source: @karpathy on X. According to @karpathy, the current council includes openai/gpt-5.1, google/gemini-3-pro-preview, anthropic/claude-sonnet-4.5, and x-ai/grok-4, providing side-by-side outputs and rankings across OpenAI, Google, Anthropic, and xAI model families. Source: @karpathy on X. According to @karpathy, cross-model evaluation frequently selects another model’s response as superior, highlighting a practical peer-review method for model selection and ranking. Source: @karpathy on X. According to @karpathy, in his reading tests the models consistently praised GPT-5.1 as the best and most insightful and consistently selected Claude as the worst, with Gemini 3 Pro and Grok-4 in between, while his qualitative take found GPT-5.1 wordy, Gemini 3 more condensed, and Claude too terse. Source: @karpathy on X. According to @karpathy, the code is publicly available for others to try on GitHub under the llm-council repository. Source: @karpathy on X and @karpathy on GitHub. According to @karpathy, the post does not mention cryptocurrencies, tokens, or blockchains, and provides no direct crypto market claims. Source: @karpathy on X.
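The three-stage council flow described here (dispatch, anonymized peer ranking, chairman synthesis) can be mirrored with stubs. The real llm-council calls models through OpenRouter; the responder and ranking functions below are toy stand-ins for illustration, not the repository's code.

```python
# Toy sketch of the council flow: every model answers, each ranks the
# anonymized responses, and a chairman produces the final output. Stub
# rankers prefer longer answers, standing in for an LLM's judgment.

def dispatch(query, models):
    """Stage 1: every council model answers the query independently."""
    return {name: fn(query) for name, fn in models.items()}

def peer_rank(responses, models):
    """Stage 2: each member ranks the anonymized responses; lower total
    placement means a higher overall ranking."""
    anonymized = list(responses.values())
    tallies = {r: 0 for r in anonymized}
    for _ in models:  # one ballot per council member
        for place, resp in enumerate(sorted(anonymized, key=len, reverse=True)):
            tallies[resp] += place
    return min(tallies, key=tallies.get)

def chairman(query, responses, models):
    """Stage 3: a chairman model synthesizes the final answer; the stub
    just returns the top-ranked response."""
    return peer_rank(responses, models)

models = {
    "model_a": lambda q: f"{q}: a short answer",
    "model_b": lambda q: f"{q}: a longer, more detailed answer",
}
final = chairman("What is 2+2?", dispatch("What is 2+2?", models), models)
```

Because the rankings are computed over anonymized responses, a member can (and, per Karpathy's observation, frequently does) rank another model's answer above its own.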

Source
2025-11-22
02:11
Andrej Karpathy seeks quantitative definition of AI 'slop' and a measurable 'slop index' using LLM miniseries and thinking token budgets for evaluation

According to @karpathy, he is seeking a quantitative, measurable definition of AI 'slop' and notes he has an intuitive 'slop index' but lacks a formal metric. Source: @karpathy on X, Nov 22, 2025. According to @karpathy, potential approaches he is considering include using LLM miniseries and analyzing thinking token budgets to quantify output quality and cost. Source: @karpathy on X, Nov 22, 2025. For traders in AI and crypto-adjacent markets, this post highlights an active gap in standardized LLM quality metrics that directly ties to model evaluation and cost controls, which are key inputs for pricing and benchmarking AI products. Source: @karpathy on X, Nov 22, 2025.

Source
2025-11-21
16:43
Andrej Karpathy on AI Intelligence Diversity: No Direct Crypto Trading Catalyst for Markets

According to @karpathy, the space of intelligences is large and animal intelligence is only a single point arising from a specific optimization process fundamentally distinct from that of artificial systems. Source: @karpathy on X, Nov 21, 2025. The post is conceptual and provides no product announcements, model releases, datasets, performance metrics, timelines, or any crypto asset or token mentions, indicating no direct trading catalyst for crypto or equities. Source: @karpathy on X, Nov 21, 2025. For crypto market context, this statement aligns with the broader AI agents and autonomous intelligence narrative, but the source offers no on-chain, protocol, or market data. Source: @karpathy on X, Nov 21, 2025.

Source
2025-11-18
00:29
Andrej Karpathy details 3-pass LLM reading workflow and shift toward writing for LLMs

According to @karpathy, he now reads blogs, articles, and book chapters using a three-pass LLM workflow: pass 1 manual reading, pass 2 explain and summarize, and pass 3 Q&A, which he says yields a much deeper understanding than reading once and moving on, source: @karpathy on X, Nov 18, 2025. He adds that this habit is growing into one of his top LLM use cases, source: @karpathy on X, Nov 18, 2025. He also states that writers may increasingly write for an LLM so the model first internalizes the idea and then targets, personalizes, and serves it to users, source: @karpathy on X, Nov 18, 2025. The post does not mention cryptocurrencies or trading signals, indicating any crypto market relevance would be indirect via LLM usage patterns in content consumption and personalization, source: @karpathy on X, Nov 18, 2025.

Source
2025-11-17
18:56
Crypto Trading Discipline: Andrej Karpathy Urges Principles Over Galaxy Brain Rationalization with 2 Actionable Strategies for Volatile Markets

According to @karpathy, traders should prioritize rule-based principles and avoid post-hoc galaxy brain justifications, citing two actionable strategies: have principles and hold the right bags, financially and socially; source: @karpathy on X, Nov 17, 2025; x.com/VitalikButerin/status/1986906940472238108. According to @karpathy, applying constraint-based rules akin to simple guardrails is preferable to flexible utility calculus, reinforcing disciplined entries, position sizing, and clear no-trade conditions during volatility; source: @karpathy on X, Nov 17, 2025. According to @karpathy, aligning positions with long-term conviction and social capital helps avoid rotating into narratives you cannot defend under stress, supporting consistent execution in crypto markets; source: @karpathy on X, Nov 17, 2025.

Source
2025-11-16
17:56
AI Software 2.0 and Verifiability: Trading Implications for Crypto Markets (BTC, ETH) from @karpathy in 2025

According to @karpathy, AI should be viewed as Software 2.0 that optimizes programs against explicit objectives, making task verifiability the primary predictor of automation readiness, source: @karpathy on X, Nov 16, 2025. He states that verifiable tasks are those with resettable environments, efficient iteration, and automated rewards, enabling gradient descent or reinforcement learning to practice at scale, source: @karpathy on X, Nov 16, 2025. He adds that such tasks progress rapidly and can surpass top experts in domains like math and code, while creative and context-heavy tasks lag, source: @karpathy on X, Nov 16, 2025. Interpreted for trading, crypto workflows with clear, checkable outcomes such as strategy backtests, execution slippage minimization, market making simulations, and on-chain anomaly detection align with the verifiable category and are thus more automatable under this framework, source: interpretation based on @karpathy on X, Nov 16, 2025. Conversely, discretionary macro narratives and multi-step fundamental synthesis without fast feedback are less automatable near term, shaping where AI edges may emerge across BTC and ETH trading pipelines, source: interpretation based on @karpathy on X, Nov 16, 2025.

Source
2025-11-13
21:12
Self-Driving Will Reshape Cities: Andrej Karpathy’s 2025 Call and 5 Trading Takeaways for AI Crypto Tokens (FET, RNDR, AGIX, OCEAN)

According to @karpathy, self-driving will cut parked cars and parking lots, improve safety, reduce noise, reclaim urban space, and enable cheaper programmable delivery, framing a step-change in real-world automation rather than a gradual tweak, which can act as a sentiment catalyst for AI and robotics narratives in risk assets, including crypto. Source: @karpathy on X, Nov 13, 2025. For traders, the immediate read-through is to watch AI-narrative crypto tokens such as FET, RNDR, AGIX, and OCEAN for potential narrative rotation flows tied to autonomous logistics and edge-AI enthusiasm sparked by this commentary. Source: @karpathy on X, Nov 13, 2025.

Source
2025-11-12
20:28
Tesla FSD v13 on HW4 delivers flawless drive reported by @karpathy - TSLA trading takeaways

According to @karpathy, a new HW4 Tesla Model X running FSD v13 completed a smooth, confident highway and city route that handled lane centering, construction detours, tricky left turns, four-way stops, bus overtakes, dense merges, parking, and ended as a perfect drive with no notes, indicating a markedly better experience than HW3. Source: Andrej Karpathy on X, Nov 12, 2025. According to @karpathy, the results reflect FSD v13 on HW4 because his car has not yet received v14, providing a current field-performance reference for traders tracking Tesla’s autonomy progress. Source: Andrej Karpathy on X, Nov 12, 2025. According to @karpathy, progress is driven by an end-to-end long-context neural network that processes surround video at 60 Hz with multimodal sensor streams over roughly 30 seconds, with technical hints attributed to Ashok Elluswamy’s ICCV25 talk. Source: Andrej Karpathy on X, Nov 12, 2025; Ashok Elluswamy on X (ICCV25 talk referenced by Karpathy). According to @karpathy, this firsthand report underscores a material performance gap in favor of HW4 versus HW3 for FSD v13, a datapoint TSLA-focused traders can use when evaluating hardware-driven capability differences in Tesla’s fleet. Source: Andrej Karpathy on X, Nov 12, 2025. According to @karpathy, no cryptocurrencies, blockchain integrations, or digital assets are mentioned in this report, implying no direct crypto market linkage in the update. Source: Andrej Karpathy on X, Nov 12, 2025.

Source
2025-10-26
16:24
PyTorch MPS addcmul_ Silent-Failure Bug on Non-Contiguous Tensors Flags AI Training Risk: What Traders Should Watch

According to @karpathy, a detailed debugging investigation traced a suspicious training loss curve to a PyTorch MPS backend issue where addcmul_ silently fails on non-contiguous output tensors in the Objective-C++ path, pointing to a correctness bug that does not throw errors during training; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and the referenced thread by @ElanaPearl https://x.com/ElanaPearl/status/1981389648695025849. For AI workflow reliability, this implies MPS-based training on Apple Silicon Macs can yield incorrect results without explicit runtime alerts, directly impacting the integrity of model training and evaluation pipelines used by practitioners; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and @ElanaPearl on X https://x.com/ElanaPearl/status/1981389648695025849. For traders, treat this as a software reliability risk flag within the AI toolchain and monitor official PyTorch or Apple MPS updates and release notes that reference addcmul_ or non-contiguous tensor handling, as confirmed fixes would reduce operational uncertainty around AI workloads that markets track for sentiment; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and @ElanaPearl on X https://x.com/ElanaPearl/status/1981389648695025849.

Source
2025-10-24
15:35
Karpathy Unveils SpellingBee for nanochat d32: Step-by-Step SFT/RL Finetuning Guide to Add Letter-Counting Capability and Its AI-Token Implications

According to @karpathy, he released a full guide showing how a new synthetic task called SpellingBee teaches nanochat d32 to count letters in words like strawberry by generating user-assistant training pairs and midtraining or SFT finetuning, with optional RL to improve robustness, source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164. The method stresses diverse user prompts, careful tokenization and whitespace handling, breaking reasoning into multiple tokens by standardizing the word, spelling it out, iterating with an explicit counter, and encouraging two solution paths via manual reasoning and Python tool use, source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164. Karpathy notes that because nanochat d32 is small, the capability is encouraged by over-representing examples in the dataset, and reliability can be further improved by simulating mistakes in data or running RL, source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164. For traders, open-source progress on small LLM tooling has coincided with episodic attention flows to AI-linked crypto assets such as RNDR, FET, and AGIX around major AI catalysts, with Kaiko reporting AI token rallies around Nvidia earnings in 2024, source: Kaiko Research 2024 weekly market reports; Nvidia 2024 earnings releases. No token or product launch is included here; this is a technical training guide and example set for capability injection into a small LLM, source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164.
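The reasoning shape the guide describes (standardize the word, spell it out one letter at a time, and keep an explicit running counter) can be sketched as a synthetic pair generator. The templates below are invented for illustration and are not the ones in nanochat discussion 164.

```python
# Sketch of a SpellingBee-style synthetic training pair: the assistant
# turn standardizes the word, spells it with separators so each letter
# gets its own token, and counts with an explicit running tally.

def spelling_bee_example(word: str, letter: str) -> tuple[str, str]:
    user = f"How many '{letter}' are in '{word}'?"
    w = word.strip().lower()                 # standardize the word
    spelled = ":".join(w)                    # one letter per token
    steps, count = [], 0
    for ch in w:                             # iterate with explicit counter
        if ch == letter:
            count += 1
        steps.append(f"{ch} -> {count}")
    assistant = f"Spelling: {spelled}\n" + "\n".join(steps) + f"\nAnswer: {count}"
    return user, assistant

user, assistant = spelling_bee_example("strawberry", "r")
```

Varying the user phrasing and over-representing such pairs in the mix is what the guide relies on to make a small model like nanochat d32 pick up the capability.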

Source
2025-10-21
15:59
Andrej Karpathy Unveils nanochat d32: $800 Synthetic-Data Custom LLM Identity and Script Release, Key Signals for AI Agent Builders

According to @karpathy, nanochat now carries a defined identity and can state its capabilities, including that it is nanochat d32 built by him with a reported $800 cost and weaker non-English proficiency, achieved via synthetic data generation, source: x.com/karpathy/status/1980508380860150038. He released an example script that demonstrates generating diverse synthetic conversations and mixing them into mid-training or SFT, stressing the importance of entropy to avoid repetitive datasets, source: x.com/karpathy/status/1980508380860150038. He adds that base LLMs lack inherent personality or self-knowledge and require explicitly bolted-on traits via curated synthetic data, source: x.com/karpathy/status/1980508380860150038. For traders, the disclosed $800 customization benchmark and open-source workflow provide concrete cost and process reference points for evaluating open-source AI agent development and adoption paths across AI-linked tokens and AI-exposed equities, source: twitter.com/karpathy/status/1980665134415802554.
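A minimal sketch of the synthetic identity-data idea: pair many varied user phrasings with a fixed identity answer so the trait is bolted on through data rather than emerging from the base model. The openers, questions, and identity string below are invented for illustration; Karpathy's released script differs.

```python
# Generate varied identity Q&A pairs; shuffling over a template product
# injects the entropy the post stresses, avoiding a repetitive dataset.
import itertools
import random

IDENTITY = ("I am nanochat d32, a small chat model trained for about $800; "
            "my non-English abilities are weak.")

OPENERS = ["Hey,", "Quick question:", "Hi there,", "Tell me,"]
QUESTIONS = ["who are you?", "what model is this?", "what are you called?",
             "can you introduce yourself?"]

def synth_identity_pairs(n: int, seed: int = 0) -> list[tuple[str, str]]:
    rng = random.Random(seed)
    combos = list(itertools.product(OPENERS, QUESTIONS))
    rng.shuffle(combos)  # entropy: avoid an ordered, repetitive dataset
    return [(f"{o} {q}", IDENTITY) for o, q in combos[:n]]

pairs = synth_identity_pairs(8)
```

Mixing such pairs into mid-training or SFT is the mechanism the post describes for giving a base model explicit self-knowledge.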

Source
2025-10-20
22:13
Andrej Karpathy: DeepSeek-OCR Signals 4 Reasons Pixels May Beat Text Tokens for LLM Inputs — Efficiency, Shorter Context Windows, Bidirectional Attention, No Tokenizer

According to Andrej Karpathy, the DeepSeek-OCR paper is a strong OCR model and more importantly highlights why pixels might be superior to text tokens as inputs to large language models, emphasizing model efficiency and input fidelity, source: Andrej Karpathy on X, Oct 20, 2025. He states that rendering text to images and feeding pixels can deliver greater information compression, enabling shorter context windows and higher efficiency, source: Andrej Karpathy on X, Oct 20, 2025. He adds that pixel inputs provide a more general information stream that preserves formatting such as bold and color and allows arbitrary images alongside text, source: Andrej Karpathy on X, Oct 20, 2025. He argues that image inputs enable bidirectional attention by default instead of autoregressive attention at the input stage, which he characterizes as more powerful for processing, source: Andrej Karpathy on X, Oct 20, 2025. He advocates removing the tokenizer at input due to the complexity and risks of Unicode and byte encodings, including security or jailbreak issues such as continuation bytes and semantic mismatches for emojis, source: Andrej Karpathy on X, Oct 20, 2025. He frames OCR as one of many vision-to-text tasks and suggests many text-to-text tasks can be reframed as vision-to-text, while the reverse is not generally true, source: Andrej Karpathy on X, Oct 20, 2025. He proposes a practical setup where user messages are images while the assistant response remains text and notes outputting pixels is less obvious, and he mentions an urge to build an image-input-only version of nanochat while referencing the vLLM project, source: Andrej Karpathy on X, Oct 20, 2025.

Source
2025-10-20
18:58
Karpathy on Text Diffusion for LLMs (2025): Bidirectional Attention Raises Training Cost vs Autoregression

According to @karpathy, text diffusion for language can be implemented with a vanilla transformer using bidirectional attention that iteratively re-masks and re-samples all tokens on a noise schedule. Source: @karpathy. He states that diffusion is the pervasive generative paradigm in image and video, while autoregression remains dominant in text, and audio shows a mix of both. Source: @karpathy. He adds that removing heavy formalism reveals simple baseline algorithms, with discrete diffusion closer to flow matching in continuous settings. Source: @karpathy. He explains that autoregression appends tokens while attending backward, whereas diffusion refreshes the entire token canvas while attending bidirectionally. Source: @karpathy. He notes bidirectional attention yields stronger language models but makes training more expensive because it cannot be parallelized across the sequence dimension. Source: @karpathy. He suggests it may be possible to interpolate or generalize between diffusion and autoregression in the LLM stack. Source: @karpathy. For traders, the actionable takeaway is the compute cost trade-off of bidirectional text diffusion versus autoregression, which directly affects training efficiency assumptions. Source: @karpathy.
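The re-mask and re-sample loop can be illustrated with a toy sampler: the canvas starts fully masked and every masked position is refreshed in parallel each step, with a noise schedule shrinking the masked fraction. The stand-in model proposes random vocabulary tokens; a real implementation would use a trained bidirectional transformer.

```python
# Toy discrete-diffusion sampling loop over a token "canvas". The model
# stub proposes random tokens; only the loop structure mirrors the
# re-mask / re-sample scheme described above.
import random

MASK = "<m>"
VOCAB = ["the", "cat", "sat", "on", "mat"]

def toy_denoise_step(canvas, rng):
    """Stand-in for a bidirectional model: propose a token everywhere."""
    return [rng.choice(VOCAB) for _ in canvas]

def diffusion_sample(length=8, steps=4, seed=0):
    rng = random.Random(seed)
    canvas = [MASK] * length
    for t in range(steps):
        # resample every masked position in parallel (one bidirectional pass)
        proposal = toy_denoise_step(canvas, rng)
        canvas = [p if c == MASK else c for c, p in zip(canvas, proposal)]
        # noise schedule: re-mask a shrinking random fraction each step
        n_mask = int(length * (1 - (t + 1) / steps))
        for i in rng.sample(range(length), n_mask):
            canvas[i] = MASK
    return canvas

out = diffusion_sample()
```

Contrast with autoregression: an AR sampler would append one token per step attending only backward, whereas every step here touches the whole sequence, which is the source of the training-cost trade-off noted in the post.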

Source
2025-10-18
20:23
Karpathy’s Decade of Agents: 10-Year AGI Timeline, RL Skepticism, and Security-First LLM Tools for Crypto Builders and Traders

According to @karpathy, AGI is on roughly a 10-year horizon he describes as a decade of agents, citing major remaining work in integration, real-world sensors and actuators, societal alignment, and security, and noting his timeline is 5-10x more conservative than prevailing hype, source: @karpathy on X, Oct 18, 2025. He is long agentic interaction but skeptical of reinforcement learning due to poor signal-to-compute efficiency and noise, and he highlights alternative learning paradigms such as system prompt learning with early deployed examples like ChatGPT memory, source: @karpathy on X, Oct 18, 2025. He urges collaborative, verifiable LLM tooling over fully autonomous code-writing agents and warns that overshooting capability can accumulate slop and increase vulnerabilities and security breaches, source: @karpathy on X, Oct 18, 2025. He advocates building a cognitive core by reducing memorization to improve generalization and expects models to get larger before they can get smaller, source: @karpathy on X, Oct 18, 2025. He also contrasts LLMs as ghost-like entities prepackaged via next-token prediction with animals prewired by evolution, and suggests making models more animal-like over time, source: @karpathy on X, Oct 18, 2025. For crypto builders and traders, this points to prioritizing human-in-the-loop agent workflows, code verification, memory-enabled tooling, and security-first integrations over promises of fully autonomous AGI, especially where software defects and vulnerabilities carry on-chain risk, source: @karpathy on X, Oct 18, 2025.

Source