> TheAuditor / blog

Cross-Language Taint Analysis Is Real Now

Every commercial SAST tool stops at the exec call. Ours doesn't. Here is what tracing a single piece of user input across multiple languages actually looks like.

Pick any commercial SAST tool on the market. Hand it this:

exec.Command("bash", "./deploy.sh", userInput)

It will flag “command injection at line 42.” It will tell you userInput is tainted. It will stop there.

It will not open deploy.sh. It will not follow $1 to wherever bash exports it. It will not notice that two lines later, deploy.sh runs terraform apply -var "name=$1". It will not link the bash variable to the Terraform variable {} block. It will not trace the value into aws_s3_bucket.example.name.

The bug is reachable. The tool just stopped looking.

What “cross-language taint” actually means

Real applications are not one language. Even a modest production stack has a JavaScript frontend, a Java or Go backend, a Python data worker, a bash deploy script, and Terraform or CDK provisioning the infrastructure. Each handoff between them is a place where the taint goes silent: not because the bug is gone, but because the tool’s mental model ends at the file boundary.

Ours does not. We index every supported language into the same normalized graph, with bidirectional edges for every cross-boundary handoff. One source. One sink. One unbroken hop list, no matter how many languages the data passes through to get there.
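As a rough illustration of the idea, and not the engine's actual data model, here is a minimal Python sketch of a normalized graph with forward and backward edges, walked breadth-first from source to sink. The Node and TaintGraph names, the main.go, variables.tf, and main.tf file names, and the node layout are all assumptions made for this example; only the chain itself comes from the deploy.sh scenario above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    language: str   # e.g. "go", "bash", "hcl"
    location: str   # file (and line, where known); file names are hypothetical
    symbol: str     # the variable or attribute carrying the value


class TaintGraph:
    """Toy cross-language graph: every edge is stored in both directions."""

    def __init__(self):
        self.forward = {}
        self.backward = {}

    def add_edge(self, src, dst):
        self.forward.setdefault(src, []).append(dst)
        self.backward.setdefault(dst, []).append(src)

    def trace(self, source, sink):
        """Breadth-first search; returns the hop list or None if unreachable."""
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == sink:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for nxt in self.forward.get(node, []):
                if nxt not in parents:
                    parents[nxt] = node
                    queue.append(nxt)
        return None


graph = TaintGraph()
go_arg = Node("go", "main.go:42", "userInput")
bash_arg = Node("bash", "deploy.sh", "$1")
tf_var = Node("hcl", "variables.tf", "var.name")
bucket = Node("hcl", "main.tf", "aws_s3_bucket.example.name")

graph.add_edge(go_arg, bash_arg)   # subprocess handoff
graph.add_edge(bash_arg, tf_var)   # terraform apply -var "name=$1"
graph.add_edge(tf_var, bucket)     # var.name used in the resource

path = graph.trace(go_arg, bucket)
languages = [node.language for node in path]
```

The backward map is what lets the same structure answer the reverse question: starting from the sink, which sources can reach it.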

The boundaries we cross

Boundary type           What gets bridged
HTTP between services   axios.post -> @RequestBody -> requests.post -> @app.post
Subprocess / exec       exec.Command / Runtime.exec / subprocess.run -> bash script
Bash to CLI tool        $1 -> terraform apply -var -> var.X in HCL
Filesystem bridge       fs.writeFile(json) in Node -> json.load in Python
gRPC                    .proto contract -> generated stubs on both sides
Kafka                   producer.send(topic) -> consumer.Messages() on the consumer side
CI workflow_dispatch    github.event.pull_request.title -> $ENV_VAR -> deploy script

Each of these is a first-class hop in the trace — not a black box where the chain dies.
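To make the subprocess and bash-to-CLI handoffs concrete, here is a hedged Python sketch of how an exec call might be linked to bash positional parameters, and a bash command line to Terraform variable names. All three helper functions are hypothetical, written for this post's example rather than taken from the engine. The parser only handles the space-separated -var "name=value" form and only matches a bare positional parameter such as $1, not values that pass through an intermediate bash variable.

```python
import shlex


def bash_positionals(argv):
    """Map the argv of a `bash script args...` call to $0, $1, $2, ..."""
    if len(argv) < 2 or argv[0] != "bash":
        return {}
    # argv[1] is the script ($0); argv[2] becomes $1, and so on.
    return {f"${index}": value for index, value in enumerate(argv[1:])}


def terraform_var_bindings(command_line):
    """Extract {terraform_variable: bash_expression} from -var arguments."""
    tokens = shlex.split(command_line)
    bindings = {}
    for flag, argument in zip(tokens, tokens[1:]):
        if flag == "-var" and "=" in argument:
            name, expression = argument.split("=", 1)
            bindings[name] = expression
    return bindings


def tainted_terraform_vars(argv, command_line, tainted_indexes):
    """Terraform variables fed directly by a tainted argv element."""
    if not bash_positionals(argv):
        return []
    # argv index i corresponds to bash parameter $(i - 1).
    tainted_params = {
        f"${i - 1}" for i in tainted_indexes if 1 <= i < len(argv)
    }
    bindings = terraform_var_bindings(command_line)
    return sorted(
        name for name, expr in bindings.items() if expr in tainted_params
    )


# The exec.Command example from the top of the post; "<userInput>" is a
# placeholder for the attacker-controlled value.
argv = ["bash", "./deploy.sh", "<userInput>"]
result = tainted_terraform_vars(
    argv, 'terraform apply -var "name=$1"', tainted_indexes={2}
)
```

Here result holds the Terraform variable names that inherit taint from the exec call, which is the edge a single-language tool never draws.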

A real chain, end to end

A three-language chain we routinely trace, hop by hop:

[0] entry_point       ConfigController.java:42    POST /api/config
[1] assignment        ConfigService.java:14       request.aclSetting -> aclSetting
[2] subprocess        ConfigService.java:19       ProcessBuilder("bash", "wrapper.sh", ...)
[3] cross_service     wrapper.sh:1                bash receives arg $4
[4] assignment        wrapper.sh:5                $4 -> ACL_SETTING
[5] cli_handoff       wrapper.sh:22               terraform -var "acl_setting=$ACL_SETTING"
[6] infra_flow        variables.tf:3              Terraform variable "acl_setting"
[7] infra_flow        s3.tf:5                     acl = var.acl_setting
[8] exit_point        s3.tf:5                     Infrastructure Public Exposure

Three languages. Nine hops. One unbroken trace. The kind of question that returns “we do not support that” on every other tool returns a full forensic path on ours.

Why this is the AppSec gap, not a curiosity

Real-world exploits do not respect file extensions. A SQL injection that starts as a sortBy field in a React component, gets forwarded across an HTTP call to Java, gets forwarded again to a Python analytics service, and lands in an ORDER BY f-string in query_builder.py is invisible to every single-language tool on the market.

The Java tool sees service.searchThreats() and gives up. The Python tool sees an unknown sort_by parameter and gives up. Nobody connects them.

Connect them and the bug pops out immediately.
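For readers who have not met this sink before, here is a small Python sketch of what the final hop might look like, and one common way to close it. The function names, the threats table, and the column allow-list are hypothetical; the contents of query_builder.py are not shown in this post.

```python
# Hypothetical allow-list of sortable columns.
ALLOWED_SORT_COLUMNS = {"created_at", "severity", "name"}


def build_order_by_unsafe(sort_by: str) -> str:
    """The sink: attacker-controlled text is spliced straight into SQL."""
    return f"SELECT * FROM threats ORDER BY {sort_by}"


def build_order_by_safe(sort_by: str) -> str:
    """Column names cannot be bound as query parameters, so validate instead."""
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    return f"SELECT * FROM threats ORDER BY {sort_by}"


# A value that began life as sortBy in the frontend.
payload = "name; DROP TABLE threats"
unsafe_sql = build_order_by_unsafe(payload)

try:
    build_order_by_safe(payload)
    rejected = False
except ValueError:
    rejected = True
```

Seen in isolation, sort_by is just a string parameter of unknown origin. Only the full chain shows that it is user input.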

What ships on day one

  • Twelve languages with parity across indexing, taint, control flow graphs, call graphs, and rule packs: Python, TypeScript and JavaScript, Java, Go, Rust, PHP, Bash, Vue, Svelte and SvelteKit, GitHub Actions, Terraform / HCL, and AWS CDK.
  • Validated on the OWASP Benchmark and OWASP Juice Shop corpora. Java (11/11), Python, and Juice Shop (31/31) all came back clean across the test set in our internal validation runs. These are benchmark numbers, not workflow guarantees: your codebase is not the benchmark, and we will not pretend it is.
  • Zero configuration. aud full --offline. Drop the binary in a repo, get a queryable database back. No language servers. No compilation step. No project file to write.

Honest disclaimers

The binary is pre-launch. We are packaging the engine with SQLCipher-encrypted analysis databases and validating the compiled artifact against the same OWASP corpora the source has already passed. We ship when the adversarial scans on the compiled binary come back clean. No date promise.

One honest scope note on the cross-language story above: we analyze one project root per run. If your services live under one umbrella tree, the engine treats them as one graph and traces between them exactly as advertised. If they live in genuinely separate repositories on disk, you run the engine in each one. Multi-repo federation is on the roadmap, not in the binary yet.

What is next

Subscribe via the signup form on the main site for launch notifications. We only email when there is something real to share.