10. Tests
Framework
JUnit 4.13.2 on RandomizedTesting runner 2.8.2. Base class ESTestCase in test/framework. Lucene test framework for index-level tests.
Test tiers
| Tier | Gradle task | Source dir | What it proves |
|---|---|---|---|
| Unit | test | src/test/java | Class-level logic with mocks |
| Internal cluster | internalClusterTest | src/internalClusterTest/java | Multi-node in one JVM (ESIntegTestCase) |
| YAML REST | yamlRestTest | src/yamlRestTest/ | REST API contract from JSON spec |
| Java REST | javaRestTest | src/javaRestTest/java | Black-box cluster via HTTP |
| BWC | bwcTest | distribution/bwc/ | Upgrade from old ES versions |
Commands
    ./gradlew check                  # full verification
    ./gradlew test                   # all unit tests
    ./gradlew :server:test --tests 'org.elasticsearch.package.ClassName'
    ./gradlew internalClusterTest
    ./gradlew :rest-api-spec:yamlRestTest
    ./gradlew run                    # run ES from source
    ./gradlew localDistro            # build tarball
What tests reveal
- YAML REST tests are the authoritative spec for REST behavior: a passing suite shows that the API contract holds for the cases those tests cover.
- ESIntegTestCase tests prove cluster state transitions (allocation, recovery) end-to-end.
- InternalEngineTests specify durability, seq_no, and conflict semantics.
- muted-tests.yml lists flaky tests temporarily excluded from CI, which signals fragile areas.
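To make the muted-tests.yml point concrete, here is a minimal JDK-only sketch that scans mute-list text for entries naming a given test class. The entry layout it expects (a "- class:" line, optionally followed by "method:" and "issue:" keys) is an assumption about the file format, and MutedTestLookup and mutedMethods are hypothetical names, not part of the Elasticsearch build.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the muted methods recorded for one test class. */
final class MutedTestLookup {

    /**
     * Returns the muted method names for className, or "*" for an entry that
     * names the class without a method. Assumes "class:" is the first key of
     * each "- " entry (an assumption about the file layout).
     */
    static List<String> mutedMethods(List<String> lines, String className) {
        List<String> result = new ArrayList<>();
        boolean inMatch = false;   // currently inside an entry for className
        boolean sawMethod = false; // that entry named a specific method
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("- ")) {
                // A new entry begins: close out the previous one first.
                if (inMatch && !sawMethod) {
                    result.add("*");
                }
                inMatch = false;
                sawMethod = false;
                line = line.substring(2).trim();
            }
            if (line.startsWith("class:")) {
                inMatch = value(line).equals(className);
            } else if (inMatch && line.startsWith("method:")) {
                result.add(value(line));
                sawMethod = true;
            }
        }
        if (inMatch && !sawMethod) {
            result.add("*");
        }
        return result;
    }

    /** Text after the first colon, trimmed. */
    private static String value(String line) {
        return line.substring(line.indexOf(':') + 1).trim();
    }

    public static void main(String[] args) {
        // Illustrative entries only; class names and URLs are made up.
        List<String> sample = List.of(
            "tests:",
            "- class: org.example.FooTests",
            "  method: testBar",
            "  issue: https://example.invalid/1");
        System.out.println(mutedMethods(sample, "org.example.FooTests"));
    }
}
```

A non-empty result for the class you are investigating suggests the failure may predate your change; the linked issue in the real file is the place to confirm.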
Likely gaps
- Full ESS cloud/stateless object-store paths — mostly x-pack QA, not vanilla OSS tests.
- Extreme scale (millions of shards) — benchmarked, not CI-tested.
- Exhaustive failure injection for network partitions: some in qa/, not complete.
11. Build, run, deploy
Build
    export JAVA_HOME=/path/to/jdk-21
    ./gradlew localDistro            # platform tarball → distribution/archives/
    ./gradlew :distribution:archives:linux-tar:assemble
    ./gradlew buildDockerImage
Evidence: BUILDING.md, TESTING.asciidoc, CONTRIBUTING.md.
Run locally
    ./gradlew run                    # default: basic license, security ON, elastic:password
    ./gradlew run -Dtests.es.xpack.security.enabled=false
    ./gradlew run --debug-jvm
Deploy
- Tar/zip/deb/rpm from distribution/
- Docker images via distribution/docker Gradle tasks
- Elastic Cloud uses separate orchestration; the stateless x-pack plugin targets an object-store architecture
- CI: Buildkite (.buildkite/) + GitHub Actions
12. Most important files to understand first
13. Things that are confusing or risky
- Master vs coordinating vs data roles: any node can coordinate searches; only master-eligible nodes vote; data nodes hold shards.
- Bulk-only index path: a single index request is wrapped as a bulk request; debugging must trace through BulkOperation.
- Async everywhere: stack traces cross thread pools; use action tracing / slow logs.
- Wire protocol evolution: mixed-version clusters require TransportVersion guards on every new field.
- Multi-project migration: APIs and ClusterState are still transitioning; watch @FixForMultiProject.
- Stateless vs classic: x-pack stateless replaces local durability with an object store; completely different engine path (StatelessPlugin).
- NodeConstruction size: 1800+ lines; hard to navigate; use the IDE structure view filtered by package.
- Test flakiness: check muted-tests.yml before assuming your change broke CI.
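The wire-protocol point can be illustrated without the Elasticsearch classes. Below is a minimal JDK-only sketch of a version-guarded field: the writer emits the new field only when the peer's version is new enough, and the reader applies the same check so both sides agree on the byte layout. WireGuardSketch, V_WITH_PRIORITY, and the Message layout are hypothetical stand-ins; the real code relies on TransportVersion and the Elasticsearch stream classes, whose API is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WireGuardSketch {
    // Hypothetical version id at which the "priority" field was added.
    static final int V_WITH_PRIORITY = 2;
    // Value the reader assumes when the sender is too old to send the field.
    static final int DEFAULT_PRIORITY = 0;

    record Message(String name, int priority) {}

    /** Serializes m for a peer speaking peerVersion. */
    static byte[] write(Message m, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(m.name());
            if (peerVersion >= V_WITH_PRIORITY) {
                out.writeInt(m.priority()); // new field: guarded by version
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Deserializes bytes produced for peerVersion, mirroring the guard in write. */
    static Message read(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int priority = peerVersion >= V_WITH_PRIORITY ? in.readInt() : DEFAULT_PRIORITY;
            return new Message(name, priority);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Message original = new Message("reroute", 7);
        System.out.println(read(write(original, 2), 2)); // field survives the round trip
        System.out.println(read(write(original, 1), 1)); // field falls back to the default
    }
}
```

The failure this guards against is asymmetric: if only one side applies the check, the reader consumes too many or too few bytes and every later field in the stream is misread.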
14. Suggested learning path
- Read README.asciidoc + CONTRIBUTING.md for build/run.
- Trace Elasticsearch.main → Node.start (startup mental model).
- Pick one REST call: set a breakpoint in RestIndexAction, follow into InternalEngine.index.
- Read ClusterState javadoc + one executor (MetadataCreateIndexService).
- Run ./gradlew :server:test --tests InternalEngineTests and read the tests as spec.
- Trace search: RestSearchAction → SearchService.executeQueryPhase.
- Read IndicesClusterStateService for the allocation → shard creation link.
- Explore one module (lang-painless) to learn the plugin SPI.
- Skim x-pack core + security if touching auth.
- Read the YAML REST test for your API in rest-api-spec.
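When following these traces in a debugger, expect execution to hop between threads, so the stack at a later breakpoint will not show the original caller. The JDK-only sketch below reproduces that effect with two plain executors; the pool names and ThreadHopSketch are hypothetical and do not use Elasticsearch's ThreadPool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadHopSketch {

    /** Runs a two-stage task and returns the thread name seen by each stage. */
    static String[] run() {
        // Two single-thread pools with recognizable (made-up) names.
        ExecutorService write = Executors.newSingleThreadExecutor(r -> new Thread(r, "sketch-write"));
        ExecutorService generic = Executors.newSingleThreadExecutor(r -> new Thread(r, "sketch-generic"));
        try {
            return CompletableFuture
                // Stage 1 executes on the "write" pool.
                .supplyAsync(() -> Thread.currentThread().getName(), write)
                // Stage 2 (the callback) is handed to the "generic" pool.
                .thenApplyAsync(first -> new String[] { first, Thread.currentThread().getName() }, generic)
                .join();
        } finally {
            write.shutdown();
            generic.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = run();
        System.out.println("stage 1 ran on " + names[0]);
        System.out.println("stage 2 ran on " + names[1]);
    }
}
```

A breakpoint in the second stage sees only the second pool's worker frames, which is why setting breakpoints at both ends of a hand-off tends to work better than stepping through it.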
15. Open questions
Items not fully determined from static code review alone:
- Production ESS topology: exact orchestration of stateless index/search roles in cloud (code exists in x-pack stateless but deployment wiring is external).
- Default thread pool sizes: depend on node specs and settings; need runtime config docs + ThreadPool settings at deploy time.
- Feature flag defaults: some behavior is gated by FeatureService cluster features; defaults change between minor versions.
- Exact publication latency SLOs: the implementation has timeouts, but operational targets are product/SLA, not in the repo.
- Complete entitlement policy surface: evolving with JDK versions; need a running JVM to enumerate effective policies.
Full pinned source: github.com/elastic/elasticsearch @ f96bad22