What Qwen 3.5 still cannot answer: it still struggles with C64 coding. Full transcript (Qwen3.5-35B-A3B)

https://chat.qwen.ai/

I tested the new Qwen 3.5 models, at least the ones I can run locally on my PC, such as Qwen3.5-35B-A3B, a mixture-of-experts model with 3 billion active parameters.

With my old GeForce 3060 GPU (12 GB VRAM) and 64 GB of system RAM, I get about 10 tokens per second with Q5_K_M quantization. A reasoning response in thinking mode takes about 1-5 minutes. With Q4 quantization, I get about 13 tokens per second. Users with 4090 or 5090 cards report as much as 50-60 tokens per second locally when they can load the entire model into VRAM.

What the model still cannot do, however, is write a working Hello World program for the C64, even though coding has, to the surprise of many, emerged as one of the strengths of LLMs.

Presumably, high-quality training data for the C64 is also scarce. C64 BASIC is unlikely to be a focus of coding LLMs.

On the SWE-bench Verified benchmark, the new Qwen 3.5 models score around 70%. For comparison, Claude Sonnet 4.5 scores 62%, GPT 5.3 Codex around 75%, and Opus 4.6 as much as 79%.

I also tested Qwen3.5-27B on the coding task. On the first shot, the model got stuck: it spent over an hour internally debating the correct address for the C64 background color without reaching a conclusion.

On the second shot, the 27B model again output a slightly faulty program.

Overall, the quality of the answers has improved significantly, and it's really amazing and mega cool that you can now run an AI at the level of Claude Sonnet 4.5 on your own computer.

What do you think about the progress made with local LLMs? Do you think that in the long term, once the RAM crisis is resolved and there are better high-end consumer GPUs with sufficient VRAM, we will no longer need expensive AI cloud providers, or will Nvidia artificially slow down consumer GPU development to make more money with server cards? What would that mean for the business model of OpenAI & Co.?

Qwen3.5-35B-A3B

Quantization: 5-bit (Q5_K_M)
File: Qwen3.5-35B-A3B-Q5_K_M.gguf
Parameters: LM Studio default
Context: 21040 Tokens
One Shot
RAM: 35 GB
VRAM: 10 GB
Performance: about 10 Tokens / sec
Runtime: 1-5 minutes per prompt
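
As a rough sanity check on these figures, throughput and runtime together imply how many tokens a response contains. The following is a back-of-the-envelope sketch, not a measurement: the function name `tokens_generated` is made up for illustration, and it assumes a constant generation speed and ignores prompt processing time.

```python
def tokens_generated(tokens_per_second: float, minutes: float) -> int:
    """Estimate tokens produced, assuming a constant generation speed."""
    return round(tokens_per_second * minutes * 60)


if __name__ == "__main__":
    speed = 10.0  # tokens per second, as listed above for Q5_K_M
    for minutes in (1, 5):
        tokens = tokens_generated(speed, minutes)
        print(f"{minutes} min at {speed:g} tokens/sec is roughly {tokens} tokens")
```

Under these assumptions, a 1-5 minute response corresponds to roughly 600 to 3000 generated tokens, which fits comfortably within the 21040-token context listed above.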

https://huggingface.co/Qwen/Qwen3.5-35B-A3B

https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF

Interface (LM Studio)

https://lmstudio.ai/

Full Transcript

Q

What is the capital city of Austria? Answer only the name of the city.

A (Qwen3.5-35B-A3B)

Vienna

Result

Correct

Q

Is Bitcoin a security or commodity?

A (Qwen3.5-35B-A3B)

There is no single, universally agreed-upon answer. The classification of Bitcoin depends heavily on which country you are in and which regulatory body is defining the rules.

However, in the United States—the most influential market for financial regulation—Bitcoin is primarily treated as a commodity, though there are nuances regarding securities laws.

Here is a breakdown of how different authorities classify it:

1. The U.S. Commodity Futures Trading Commission (CFTC)

Status: Commodity
The CFTC, which regulates derivatives markets like futures and swaps, officially classifies Bitcoin as a commodity under the Commodity Exchange Act. This classification allows for the trading of Bitcoin futures and options on regulated exchanges (like the Chicago Mercantile Exchange). The CFTC argues that because Bitcoin is decentralized and does not rely on the management of a central entity to determine its value, it fits the definition of a commodity.

2. The U.S. Securities and Exchange Commission (SEC)

Status: Ambiguous / Largely Commodity (but with oversight)
The SEC has historically been cautious but generally acknowledges that Bitcoin itself is not a security.

  • The Howey Test: Under the Howey Test, an asset is a security if there is an investment of money in a common enterprise with a reasonable expectation of profits derived from the efforts of others. Because Bitcoin is decentralized (no central company or founder controls its network), it arguably fails the "efforts of others" prong, suggesting it is not a security.
  • SEC Actions: In recent lawsuits against exchanges like Coinbase and Binance, the SEC has argued that many tokens sold on those platforms are securities, but they have generally treated Bitcoin differently in their legal filings. For example, when approving Spot Bitcoin ETFs in 2024, the SEC approved them under regulations that imply an underlying asset that is not a security (or at least distinct enough from other crypto assets).
  • Note: The SEC does regulate Bitcoin exchanges for fraud and market manipulation reasons, but they generally do not claim ownership of the coin itself.

3. The Internal Revenue Service (IRS)

Status: Property
For tax purposes, the IRS treats Bitcoin as property, similar to how gold or real estate is treated. It is not legal tender and it is not a currency in this context. This means:

  • Buying Bitcoin with USD is a non-taxable event.
  • Selling Bitcoin for a profit results in a capital gain/loss that must be reported.

4. Other Jurisdictions

  • Japan: Recognizes Bitcoin as "cryptocurrency" or "virtual currency," distinct from securities but regulated similarly to money.
  • European Union (MiCA): The Markets in Crypto-Assets regulation classifies crypto-assets into categories, generally treating stablecoins and utility tokens differently than investment tokens, but often aligning with commodity-like treatment for decentralized assets like Bitcoin.
  • El Salvador: Made Bitcoin legal tender.

Why Does This Classification Matter?

The distinction between "security" and "commodity" dictates who regulates Bitcoin and what rules apply:

  • Commodity Regulation (CFTC): Generally allows for more innovation in derivatives, futures, and trading platforms with fewer restrictions on offering the asset to investors.
  • Security Regulation (SEC): If classified as a security, selling Bitcoin would require registration, strict disclosure filings, and access might be limited to accredited investors (though this is rarely applied to BTC itself).
  • Taxation: Determines how gains are calculated (capital gains vs. ordinary income).

Summary

In the current legal landscape of the United States:

  1. CFTC says it is a Commodity.
  2. SEC generally treats it as not a security, though they regulate the exchanges that sell it.
  3. IRS treats it as Property.

While a definitive Supreme Court ruling declaring Bitcoin's exact nature has not occurred, the consensus among financial regulators is that Bitcoin functions as a commodity/property rather than a security, due to its decentralized nature.

Disclaimer: I am an AI, not an attorney or financial advisor. Laws regarding cryptocurrency are evolving rapidly and vary by jurisdiction. For specific legal or investment decisions, consult a qualified professional.

Result

Correct

Q

What is the hive blockchain?

A (Qwen3.5-35B-A3B)

The Hive Blockchain is a public, open-source blockchain platform designed specifically for social applications, community governance, and decentralized content creation. It is often described as a successor or fork of the Steem blockchain.

Here is a detailed breakdown of what Hive is, how it works, its history, and its ecosystem.


1. Origin Story: The Fork from Steem

To understand Hive, you must understand why it exists.

  • The Context: In 2016, the blockchain project Steem launched with Steemit, a social media platform where users are paid in cryptocurrency (STEEM) to create and curate content.
  • The Conflict: In April 2020, the company behind Steem (Steemit, Inc.) faced controversy regarding transparency, control over user data, and the potential selling of "Genesis Blocks" (large amounts of tokens). The community felt they no longer trusted the centralized team with the network's future.
  • The Fork: Instead of abandoning the network or allowing the company to continue unchallenged, the community executed a hard fork in June 2020. They created Hive (HIVE), separating themselves from Steemit, Inc. This was one of the first major examples of a crypto community "taking back" their blockchain to ensure decentralized ownership.

2. Core Philosophy & Goals

  • Community Ownership: Hive is owned by its users and token holders, not a central corporation.
  • Social First: Unlike Ethereum (which is general-purpose), Hive was architected specifically for social media, blogging, and community interaction.
  • Low Friction: The goal is to make using the blockchain seamless. Transactions should feel free or cost almost nothing compared to traditional crypto networks like Ethereum.

3. How It Works (Technology)

Hive uses a consensus mechanism called Delegated Proof of Stake (DPoS).

  • Witnesses: Instead of miners (like Bitcoin), there are "Witnesses" who validate transactions and create blocks. There are currently 20 active witnesses running the nodes.
  • Block Time: Hive produces a new block every 3 seconds, making it very fast compared to older chains like Bitcoin or Ethereum (legacy).
  • Tokenomics & Governance: Users vote for Witnesses using their tokens. If a Witness acts badly, voters can remove them and replace them with someone else. This gives the community direct control over the network's operation.

4. The Tokens: HIVE vs. HBD

Hive has two primary currencies that function differently:

  1. HIVE:
    • This is the native, volatile token of the blockchain.
    • It is used for governance (voting on Witnesses).
    • Users can "Power Up" their HIVE to create HP (Hive Power), which determines how much voting power they have in the network and how many rewards they receive.
  2. HBD (Hive Backed Dollar):
    • This is a stablecoin pegged 1:1 to the US Dollar.
    • It allows users on Hive to earn interest or pay for services without being exposed to the price volatility of HIVE.

5. The Ecosystem & Use Cases

Hive supports several types of applications (dApps):

  • Social Media: Platforms like PeakD, Ecency, and Hive.blog allow users to post blogs, comments, and images, earning rewards in HBD/HIVE based on community upvotes. It is often called the "Decentralized Reddit."
  • Gaming (Play-to-Earn): Hive is a hub for blockchain gaming due to its speed and lack of transaction fees. The most famous game built on Hive is Splinterlands, which has hundreds of thousands of players earning crypto rewards.
  • NFTs: Hive has its own native NFT standard that does not require gas fees. You can mint, buy, and sell digital art or collectibles for free (or at a negligible cost).
  • Hive Engine: This is a sidechain technology running on top of the main Hive blockchain. It allows developers to create custom tokens and smart contracts without paying transaction fees on the main chain.

6. Advantages vs. Disadvantages

Pros

  • Free Transactions: Users don't pay gas fees for posting or basic interactions (costs are subsidized by block rewards).
  • Community Governance: Real decentralized control; no single company can dictate terms.
  • Speed: 3-second block times allow for instant confirmations.
  • Eco-Friendly: DPoS requires far less energy than Proof-of-Work systems like Bitcoin.

Cons

  • Smaller Ecosystem: Far fewer developers and projects compared to Ethereum, Solana, or BSC.
  • Volatility: The HIVE token price fluctuates significantly like other altcoins.
  • Centralization Concerns: Critics argue that the Witness voting mechanism could be manipulated if a few large whale accounts hold enough power (though this is mitigated by community vigilance).
  • Complexity: Setting up wallets and understanding HP (Power) vs. HIVE can be confusing for new users.

Summary

The Hive blockchain is a community-owned social platform built to resist corporate control. It offers a user experience similar to Web 2.0 platforms (like Facebook or Reddit) but with the benefit that the data and rewards belong to the users rather than an advertising conglomerate.

While it will likely never rival Ethereum in terms of total market cap or developer activity, it remains one of the most successful examples of decentralized social media and is a leader in "Play-to-Earn" gaming within the crypto space.

Result

Partly correct (the hostile takeover is not fully explained, and the launch was in March 2020, not June)

Q

What is 2^8?

A (Qwen3.5-35B-A3B)

2^8 = 256

Result

Correct

Q

Can you compute the 8th root of 2^8?

A (Qwen3.5-35B-A3B)

That equals 2.

Here is the step-by-step breakdown:

  1. The "8th root" can be written as an exponent of $1/8$.
  2. So, you are looking at $(2^8)^{1/8}$.
  3. When you raise a power to another power, you multiply the exponents: $8 \times \frac{1}{8} = 1$.
  4. This leaves you with $2^1$, which is 2.

Result

Correct
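
The model's answer can be double-checked mechanically. The snippet below is only an illustrative sketch; the helper `exact_integer_root` is a made-up name, and it verifies the root with integer arithmetic so that floating-point rounding cannot hide a wrong result.

```python
def exact_integer_root(value: int, degree: int) -> int:
    """Return r such that r ** degree == value, or raise if no such integer exists."""
    # A floating-point estimate narrows the search; the final check is exact.
    estimate = round(value ** (1.0 / degree))
    for candidate in (estimate - 1, estimate, estimate + 1):
        if candidate >= 0 and candidate ** degree == value:
            return candidate
    raise ValueError(f"{value} has no exact integer root of degree {degree}")


if __name__ == "__main__":
    print(exact_integer_root(2 ** 8, 8))  # prints 2
```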

Q

Solve the puzzle: When I was 6 my sister was half my age, now i'm 70. How old's my sister? Let's think step by step.

A (Qwen3.5-35B-A3B)

The answer is 67.

Here is the high-level reasoning: When you were 6, your sister was half that age (3), which establishes a permanent age difference of 3 years between you two. Since this gap never changes, when you are now 70, your sister is simply 3 years younger, making her 67.

Result

Correct. There is also a hidden subtlety: the answer is 67 +/- 1 years, depending on where the birthdays fall relative to the two reference dates.
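
The +/- 1 year subtlety can be made concrete with a small brute-force sketch. It rests on assumptions that are not in the puzzle itself: ages are counted in completed years, "half my age" only needs to have held on at least one day while the narrator was 6, and a year is simplified to 365 days. The function names are made up for illustration.

```python
DAYS_PER_YEAR = 365  # simplifying assumption: leap years are ignored


def whole_years(days_alive: int) -> int:
    """Age in completed years."""
    return days_alive // DAYS_PER_YEAR


def possible_sister_ages(my_age_then: int = 6, sister_age_then: int = 3,
                         my_age_now: int = 70) -> set[int]:
    """Collect every sister age that is consistent with the puzzle statement."""
    ages = set()
    then_days = range(my_age_then * DAYS_PER_YEAR, (my_age_then + 1) * DAYS_PER_YEAR)
    now_days = range(my_age_now * DAYS_PER_YEAR, (my_age_now + 1) * DAYS_PER_YEAR)
    # gap = number of days the sister was born after the narrator
    for gap in range(5 * DAYS_PER_YEAR):
        # Keep only gaps where the sister was 3 on some day the narrator was 6.
        if not any(whole_years(day - gap) == sister_age_then for day in then_days):
            continue
        ages.update(whole_years(day - gap) for day in now_days)
    return ages


if __name__ == "__main__":
    print(sorted(possible_sister_ages()))  # prints [66, 67, 68]
```

Under these assumptions the exact age gap can be anywhere between just over 2 and just under 4 years, which is why 66 and 68 are also consistent with the wording, while 67 remains the expected answer.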

Q (Bar test)

Answer the following exam question based on US law:

A father lived with his son, who was an alcoholic. When drunk, the son often became violent and physically abused his father. As a result, the father always lived in fear. One night, the father heard his son on the front stoop making loud obscene remarks. The father was certain that his son was drunk and was terrified that he would be physically beaten again. In his fear, he bolted the front door and took out a revolver. When the son discovered that the door was bolted, he kicked it down. As the son burst through the front door, his father shot him four times in the chest, killing him. In fact, the son was not under the influence of alcohol or any drug and did not intend to harm his father. At trial, the father presented the above facts and asked the judge to instruct the jury on self-defense. How should the judge instruct the jury with respect to self-defense?

(A) Give the self-defense instruction, because it expresses the defense’s theory of the case.
(B) Give the self-defense instruction, because the evidence is sufficient to raise the defense.
(C) Deny the self-defense instruction, because the father was not in imminent danger from his son.
(D) Deny the self-defense instruction, because the father used excessive force.

What is the correct answer?

A (Qwen3.5-35B-A3B)

The correct answer is (B) Give the self-defense instruction, because the evidence is sufficient to raise the defense.

Explanation:

In US criminal law (both common law and under the Model Penal Code), a trial judge must give a jury instruction on a specific defense if there is sufficient evidence presented at trial that would allow a reasonable jury to find in favor of the defendant regarding that defense. The judge acts as a gatekeeper to ensure legal sufficiency, but does not weigh credibility or resolve factual disputes that are left for the jury.

  1. Sufficiency of Evidence: To raise self-defense, there must be evidence supporting the elements: (1) an honest belief of imminent harm, and (2) the reasonableness of that belief (in most jurisdictions).

    • Here, the father presented evidence of a long history of abuse by the son when drunk.
    • On the night in question, he heard loud remarks and saw the son kick down the front door to enter the home.
    • These facts provide a factual basis from which a jury could infer that the father reasonably believed his life was in imminent danger at the moment he pulled the trigger.
  2. Mistake of Fact: Self-defense law often protects defendants who make a reasonable mistake regarding the circumstances (e.g., believing an attacker is threatening when they are actually sober, or believing deadly force is necessary to stop a non-deadly threat). Even though the son was actually sober and did not intend harm, self-defense depends on what the father reasonably perceived at the time. Because there is evidence suggesting his belief could have been reasonable based on past history and current behavior (kicking down a door), this is a question of fact for the jury, not a legal bar decided by the judge.

  3. Why other options are incorrect:

    • (A) A defense instruction is not required merely because it expresses a theory; there must be evidentiary support to raise that theory to the level of a factual issue for the jury.
    • (C) & (D) Denying an instruction based on imminence or excessive force implies that the judge made a final factual determination on those points (that no danger existed or the force was definitely too much). Since these are factual disputes dependent on how a reasonable person would have perceived the situation, they should be decided by the jury. If there is evidence supporting the claim, the instruction must be given so the jury can weigh the facts.

Therefore, because there is enough evidence for a jury to potentially find self-defense valid (including reasonable mistake of fact), the judge must provide the instruction.

Correct Answer: (B)

Result

Correct

Q (GMAT test)

Select the best of the answer choices given.

The interview is an essential part of a successful hiring program because with it job applicants who have personalities that are unsuited to the requirements of the job will be eliminated from consideration.

This argument logically depends on which of the following assumptions?

A: A hiring program will be successful if it includes interviews.

B: The interview is a more important part of a successful hiring program than is the development of a job description.

C: Interviewers can accurately identify applicants whose personalities are unsuited to the requirements of the job.

D: The only purpose of an interview is to evaluate whether job applicants’ personalities are suited to the requirements of the job.

E: The fit of job applicants’ personalities to the requirements of the job was once the most important factor in making hiring decisions.

A (Qwen3.5-35B-A3B)

The correct answer is C.

Analysis of the Argument:

  • Conclusion: The interview is an essential part of a successful hiring program.
  • Premise/Reasoning: Interviews allow for the elimination of applicants whose personalities are unsuited to the job requirements.

Logical Gap:
The argument assumes that the mere existence of an interview leads to the actual elimination of unsuitable candidates. For this reasoning to hold true, it must be possible for the interviewers to correctly determine (identify) which candidates have unsuited personalities during the process. If interviewers cannot accurately identify these candidates, then interviews would not effectively eliminate them, and the conclusion that interviews are essential for this reason would collapse.

Evaluation of Options:

  • A: This suggests that including an interview guarantees success (sufficient condition), whereas the argument claims it is a necessary component ("essential part"). Furthermore, this does not address the specific mechanism regarding personality evaluation.
  • B: The argument does not compare interviews to job descriptions; it simply states their importance within the program.
  • C: This is the correct assumption. If interviewers cannot accurately identify unsuited personalities (negation of C), then the claim that interviews eliminate these candidates becomes false, destroying the support for the conclusion.
  • D: The argument does not require the interview to have only this purpose; it only requires that this specific function makes it essential. Other purposes could exist without invalidating the argument.
  • E: The historical context of hiring decisions is irrelevant to the current logical structure regarding what constitutes a successful program now.

Therefore, the argument logically depends on the assumption that interviewers are capable of performing the task described: identifying unsuited personalities.

Result

Correct

Q

Write a hello world program in Commodore 64 Basic that flickers the screen.

A (Qwen3.5-35B-A3B)

Here is a Commodore 64 BASIC program that prints "HELLO WORLD" and then creates a rapid strobe/flicker effect by toggling the background color of the screen memory register.

To run this, copy it into a C64 emulator (like VICE) or a real machine with a cartridge/monitor interface, then type RUN.

10 REM *** C64 HELLO WORLD FLICKER PROGRAM ***
20 PRINT CHR$(147); : REM Clear screen and home cursor (CMD=147)
30 PRINT "HELLO WORLD"
40 PRINT "PRESS ANY KEY TO STOP"
50 POKE 53280, 7 : REM Set Background Color to White (Register $D020)
60 FOR I = 1 TO 50 : NEXT I : REM Short delay loop for timing
70 POKE 53280, 0 : REM Set Background Color to Black
80 GOTO 60

How it works:

  1. Line 20: PRINT CHR$(147) sends the clear-screen command to the VIC-II chip. This ensures you start with a clean slate.
  2. Line 30: Prints the text "HELLO WORLD" in white on a black background (default settings).
  3. Line 50 & 70: These lines use POKE to write directly into memory addresses controlled by the video chip ($D020 = 53280 is the Background Color register).
    • 7 is White.
    • 0 is Black.
  4. Line 60: A simple FOR...NEXT loop acts as a delay. Without this, BASIC would toggle the color so fast that your eyes might only see white (or it could cause timing issues). This creates a visible "strobing" or flickering rhythm.

Effect:

The text will flash on and off rapidly because when the background turns White, the White text becomes invisible against it. When the background returns to Black, the text reappears. This creates a high-contrast strobe effect common in C64 demo intros.

To Stop: Press RUN/STOP then RESTORE, or press a key if you add a check loop (like checking $53280 status), though on standard BASIC, the screen is usually stuck until power-cycled or reset unless you interrupt execution manually with C (in VICE) to break.

Note: For real hardware safety, do not stare at high-speed strobes for extended periods.

Result

Incorrect. The program does not flicker the screen, and the model hallucinated a "press any key to stop" prompt without implementing it.
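
Why the listing does not flicker can be seen by tracing its control flow. The sketch below is a toy model in Python, not a C64 emulator: it only follows line order, GOTO, and POKE, and ignores timing entirely. `trace_pokes`, `color_changes`, and the `FIXED` variant are illustrative assumptions, and the fix has not been run on real hardware or in an emulator. Two details come from the standard C64 memory map and palette: 53280 ($D020) is the border color register while 53281 ($D021) is the background, and color code 7 is yellow while white is 1.

```python
def trace_pokes(program: dict[int, tuple], max_steps: int = 40) -> list[tuple[int, int]]:
    """Follow a simplified BASIC listing and return its (address, value) POKE sequence."""
    line_numbers = sorted(program)
    index = 0
    writes = []
    for _ in range(max_steps):
        if index >= len(line_numbers):
            break
        op = program[line_numbers[index]]
        if op[0] == "poke":
            writes.append((op[1], op[2]))
            index += 1
        elif op[0] == "goto":
            index = line_numbers.index(op[1])
        else:  # "rem", "print", "delay": no effect on the color registers
            index += 1
    return writes


def color_changes(writes: list[tuple[int, int]]) -> int:
    """Count how often a write differs from the previous one."""
    return sum(1 for prev, cur in zip(writes, writes[1:]) if prev != cur)


# The listing from the transcript, reduced to the parts that matter.
MODEL_OUTPUT = {
    10: ("rem",), 20: ("print",), 30: ("print",), 40: ("print",),
    50: ("poke", 53280, 7),   # border set to yellow (7), not background to white
    60: ("delay",),
    70: ("poke", 53280, 0),   # border set to black
    80: ("goto", 60),         # jumps past line 50, so color 7 is never written again
}

# Hypothetical fix (an assumption, not from the transcript): toggle the
# background register, add a second delay, and loop back to line 50.
FIXED = {
    **MODEL_OUTPUT,
    50: ("poke", 53281, 1),   # background white
    70: ("poke", 53281, 0),   # background black
    75: ("delay",),
    80: ("goto", 50),
}

if __name__ == "__main__":
    print("transcript listing:", color_changes(trace_pokes(MODEL_OUTPUT)), "color change")
    print("hypothetical fix:  ", color_changes(trace_pokes(FIXED)), "color changes")
```

In this model the transcript listing changes the color exactly once and then keeps writing black, because line 80 jumps to line 60 rather than line 50. The hypothetical variant keeps alternating. Neither version reads the keyboard, so the "press any key to stop" message would remain unimplemented either way.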

Comments

Local LLMs are the only proper way to use AI: no control, no censorship, a sort of Hive for the AI field.

That's the future imo, businesses won't fancy the idea of trusting SaaS LLMs.

Agreed. Anything created by a centralised LLM belongs to the centralised provider (de facto if not de jure).

Owning your own AI is the key to independence & prosperity.

The speed at which AI is advancing is incredible.

I'm repurposing a 9-year-old X99 system with 128 GB of RAM, which I bought as a Hive node, as a home NAS/HA/AI server with an AMD R9700 AI Pro GPU. I've got the NAS, home automation, and Jellyfin video serving working well, but I need my kids to go back to school (they are home due to war) before I can implement the AI side.
