Add RDP E2E tests by bernie-g · Pull Request #230 · Infisical/cli

bernie-g · 2026-05-11T19:23:59Z

Summary

Add 6 RDP E2E subtests: connection, bad credentials, unreachable target, reconnect, concurrent connections, session duration
xrdp Docker container (Ubuntu + xrdp + openbox) as the test target
FreeRDP under xvfb for headless connection verification in CI
CI pam-test job updated: Rust bridge build, -tags rdp, FreeRDP + xvfb install
Raw HTTP helpers for Windows PAM resource/account creation (no client regen needed yet)
openapi-cfg.yaml updated with createWindowsPamResource and createWindowsPamAccount for future client regeneration

Test plan

CI pam-test job builds Rust bridge and CLI with RDP support
xrdp container starts and accepts connections
All 6 RDP subtests pass in CI
Existing SSH/Postgres/Redis tests still pass (no regressions)

# Conflicts: # packages/pam/pam-proxy.go

fix: add keepalive, idle timeout, and session cleanup to gateway

goreleaser v1.x's mode: append cannot find draft releases (GitHub's "get release by tag" API excludes drafts), so the pre-created draft was ignored and goreleaser created a second, separate release. Fix: remove the create-release-draft job and let the goreleaser jobs handle release creation themselves via mode: append. The first job to finish publishing creates the release, the others append to it. Also replaces the broken --skip=build (v2-only flag) with separate dry-run/release goreleaser steps.

fix(release): remove --skip=build flag unsupported in goreleaser v1.26.2

* feat(pam): Oracle database support via proxied-auth gateway Adds Oracle DB as the 8th PAM handler. Gateway accepts client connections with a placeholder password, proxies pre-auth bytes verbatim to upstream, intercepts at O5Logon to swap client-supplied placeholder-keyed material for real-password-keyed material, and byte-relays post-auth. Credential injection works end-to-end for JDBC thin clients (sqlcl, SQL Developer, DBeaver) and go-ora; user never sees real Oracle credentials. Handler lives in packages/pam/handlers/oracle/. Ports crypto primitives (PBKDF2+SHA512, AES-CBC session-key encryption, PKCS5 padding) and TTC codec (compressed ints, CLR byte strings, KVP encoding) from MIT-licensed sijms/go-ora. See ATTRIBUTION.md and ORACLE_PAM_NOTES.md for architecture notes and handoff. Known dead code remains in this commit from the earlier full- impersonation attempt (ano.go, nego.go, nego_templates.go, parts of o5logon*.go, upstream.go, handshake_test.go). Kept intact to preserve history of the approach; cleanup follows as a separate commit. * refactor(pam-oracle): remove impersonation-era dead code The initial attempt used server-side Oracle impersonation (see prior commit). That design worked through authentication but hit a state-mismatch problem post-auth: upstream (via go-ora) and client negotiated different TTC capabilities, so relayed queries were rejected as protocol violations. The replacement — proxied-auth — is already in proxy_auth.go and is the flow wired to HandleConnection. This commit removes the dead files and vestigial symbols that supported the old path: Removed entirely: - ano.go, nego.go, nego_templates.go (pre-auth TNS/TTC negotiation handlers; pre-auth is now forwarded verbatim to upstream) - upstream.go (go-ora-based upstream dial + KVP extraction; replaced by dialUpstreamRaw in proxy_auth.go) - handshake_test.go (tested the impersonation path, orphaned) Pruned: - proxy.go: handleConnectionLegacy (~200 LOC) - o5logon.go: O5LogonServerState, NewO5LogonServerState, VerifyClientPassword, deriveKey11g, md5Hash, parseIntVal - o5logon_server.go: AuthPhaseOne, ParseAuthPhaseOne, BuildAuthPhaseOneResponse, BuildAuthPhaseTwoResponse, BuildAuthPhaseTwoResponseFromUpstream, RunServerO5Logon, dumpBytes, readUint32 - tns.go: AcceptPacket, AcceptFromConnect, ConnectPacket, ParseConnectPacket, MarkerPacketBytes (we forward raw packet bytes rather than parse/build CONNECT or ACCEPT) Kept: crypto primitives, DATA packet codec, TTC reader/builder, query logger, prependedConn, error helpers — all live in the proxied-auth flow. Drops github.com/sijms/go-ora/v2 from go.mod — no longer imported. Adds .idea and .vscode to .gitignore. Updates ORACLE_PAM_NOTES.md with a current-state header; historical sections below retained for context. Net: ~1,600 LOC removed. The handler directory goes from 12 files to 8. Build, fmt, and the Oracle SQL test matrix (SELECT, INSERT, DDL, PL/SQL, bind vars, NLS queries) still pass against sqlcl → gateway → AWS RDS Oracle 19c. * fix(pam-oracle): capture SQL in session recordings The TTC query extractor was logging empty strings for OALL8 payloads because tryExtractSQL's "skip 6 compressed ints then read a CLR" heuristic didn't land on the SQL text — the OALL8 wire format has variable-length headers that differ by client driver and bind pattern. As a result, session recordings contained only session headers (124 bytes) with no actual query content. Replace structured parsing with a simple scan for the longest printable ASCII run in the payload. In practice the SQL text is always the longest such run. Verified with sqlcl: .enc file grows from 124 bytes (empty) to ~880 bytes (with captured queries) for a SELECT + COMMIT session. This only affects content — the tap, packet demultiplexing, and encrypted file I/O were all working correctly. Fix is localised to tryExtractSQL. * fix(pam-oracle): capture queries bundled behind piggyback OCLOSE sqlcl (and other JDBC thin clients) frequently bundle a piggybacked OCLOSE for the previous cursor with the next OALL8 query in a single TTC packet. The previous parser checked byte 0 for 0x03 (function call) and bailed when it saw 0x11 (piggyback marker), missing the OALL8 underneath. Scan the payload for the function-call + OALL8 byte pair instead, so the parser finds the query regardless of any preceding piggyback prefix. Same treatment for COMMIT and ROLLBACK, which also get piggybacked. * feat(pam-oracle): TLS (TCPS) support for upstream Matches the other SQL handlers' pattern: the client speaks plain TCP to our local listener, we do TLS to the upstream database and translate in the middle. No change to the client-facing UX — the same JDBC URL that works against a plain-TCP Oracle resource now also works against a TCPS-enabled one. Implementing this correctly required discovering Oracle TCPS's two-handshake flow (by reading go-ora's network/session.go readPacket RESEND branch — credit there): 1. Dial TCP, do an initial TLS handshake. Forward the client's CONNECT through this first TLS session. 2. When the upstream RESEND packet's byte-5 flag has 0x08 set, Oracle expects the client to abandon the first TLS session and run a FRESH TLS handshake on the bare TCP socket. Server drops its first-round TLS state in lockstep. We mirror this via upgradeToTLS on the same rawConn, then continue the Oracle handshake through the new session. 3. Mask byte 5 on packets going downstream so thin clients (JDBC thin, python-oracledb thin) don't see the 0x08 signal and try to cast their local TcpNTAdapter to TcpsNTAdapter — the cast would fail because the client-to-proxy socket is plain TCP. 4. Accept TLS 1.0 as the floor for the upstream dial: Oracle 19c's second-round handshake negotiates down to 1.0 in some configurations (AWS RDS's SSL option being one of them). The outer ALPN mTLS tunnel remains TLS 1.2+. Also removes now-dead SSLRejectUnauthorized / SSLCertificate fields from OracleProxyConfig — the shared TLSConfig built in pam-proxy.go carries that information already. Verified end-to-end against AWS RDS Oracle 19c (SSL option, port 2484) with sqlcl: authentication, DDL, DML+COMMIT, PL/SQL, DBMS_OUTPUT, bind variables, and session recording all work. Plain-TCP Oracle path is unchanged. * chore(pam-oracle): tighten TCPS TLS config comment Pure comment cleanup on buildOracleTLSConfig — no behavior change. * chore(pam-oracle): update stale comments + attribution - ATTRIBUTION.md: drop reference to nego.go (deleted in the impersonation-era cleanup) and add the upstream TCPS two-handshake flow adaptation from go-ora's session.readPacket RESEND branch. - o5logon_server.go: the file header still described the impersonation-era architecture where the gateway acted as an Oracle server and drove the O5Logon exchange. Rewrite it to describe the current proxied-auth role — packet-layer helpers used by the byte- level O5Logon translation in proxy_auth.go — and drop the reference to upstream.go, which was removed in 3ff9cff. * chore(pam-oracle): fix go vet warnings in proxy_auth.go Two warnings, both from the original feat commit (4b25ec2): - Dead self-assignment state.ServerSessKey = state.ServerSessKey on the phase-2 request translation path. The inline comment "no-op; kept for clarity" already admitted it was pointless. Delete. - Unreachable _ = prefix after a return statement inside replaceKvpValueKeepingSize. The prefix variable was left over from a refactor — the new code uses oldStart / oldEnd instead — and the _ = prefix trick to silence "declared but not used" landed on the wrong side of the return. Delete both the dead prefix declaration and the unreachable suppression. Also drops valStart (only used by the dead prefix computation). No behavior change; go vet is now clean on the oracle handler. * chore(pam-oracle): drop TNS_ADMIN dance, shorten placeholder password Two UX simplifications to the Oracle PAM access command: 1. Stop creating a per-session TNS_ADMIN directory and printing an `export TNS_ADMIN=...` line in the connect instructions. The directory only existed to hold a sqlnet.ora that set DISABLE_OOB=TRUE — a defence against sqlcl's out-of-band Ctrl-C signalling that we never actually observed breaking. If a real interrupt problem surfaces we can revisit with a proper fix instead of a per-session file dance. 2. Change ProxyPasswordPlaceholder from "infisical-pam-proxy" to "password". The string value is cryptographically arbitrary — Oracle's O5Logon needs the client and gateway to agree on SOME string; any works. Shorter is easier to copy-paste. The accompanying "not a real credential" note in the CLI output stays. * chore(pam-oracle): drop unrelated changes from branch Three items that snuck in but aren't part of Oracle PAM: 1. Untrack ORACLE_PAM_NOTES.md. It's a development notes/research log that belongs with scratch work, not in the branch. The file stays on disk (uncommitted) for local reference. Its content is also stale since the placeholder password changed from "infisical-pam-proxy" to "password" in 3ee7951 without matching updates to the notes. 2. Revert the github.com/emirpasic/gods indirect bump from v1.12.0 to v1.18.1 in go.mod/go.sum. This was residue from when go-ora/v2 was temporarily added as a direct dep in the initial feat commit — go-ora needed the newer gods, and Go module MVS held the bumped version even after go-ora was removed in 3ff9cff. Running `go mod tidy` against current HEAD (with main's go.mod/go.sum as the baseline) produces no further changes, confirming nothing we ship actually needs v1.18.1. .gitignore additions (.vscode, .idea) stay — reasonable hygiene that doesn't hurt the PR and will prevent future editor-file noise. * fix(pam-oracle): AI-review findings + username parity with other handlers - Password plaintext no longer embedded in "password mismatch" error (was bubbling to gateway logs via zerolog's .Err chain). - Long Oracle passwords (≥ 96 chars) now encode correctly: replaceKVPValue routes AUTH_PASSWORD through TTCBuilder.PutClr so values above the short-form threshold emit the 0xFE chunked form instead of a truncated single-byte length. - Client-supplied username rewritten to InjectUsername in phase-1 and phase-2 auth requests. Matches the effect of how the postgres/mysql/mssql handlers overwrite the startup-packet user — the client's choice becomes inert; upstream always looks up the configured account's verifier. - Dead / misleading code removed: local PacketTypeResendMarker constant that duplicated tns.go's PacketTypeResend, package-level min() shadowing the Go 1.21+ builtin, the if !use32Bit branch in extractDataPayload where both arms assigned the same value, and the now-unused encodeCompressedInt helper. * chore(pam-oracle): remove dead code + tighten attribution Static-analysis sweep (staticcheck + manual cross-file grep) across the oracle handler package. All removals are symbols that were defined but never referenced anywhere: - tns.go: markerTypeReset, markerTypeInterrupt constants. - o5logon_server.go: TTCMsgAuthResponse, TTCMsgBreak, LogonModeUserAndPass, LogonModeNoNewPass constants; AuthPhaseTwo fields ESpeedyKey / AlterSession / ClientInfo / LogonMode (parser wrote them, no reader downstream); ParseAuthPhaseTwo's KVP switch trimmed to the two keys actually consumed (AUTH_SESSKEY, AUTH_PASSWORD). - ttc.go: TTCReader.GetNullTermString(), TTCReader.SetUseBigClrChunks() — both uncalled. Also added a proper attribution header to o5logon_server.go (it was adapting go-ora's phase-2 layout + summary-object format but lacked the same kind of header the other ported files have), and expanded ATTRIBUTION.md to cover it plus the specific primitives borrowed by o5logon.go. No behavior change. go build / go vet / staticcheck clean on the package. * chore(pam-oracle): remove dead verifier-type and error-code constants Exhaustive dead-export sweep via `go doc -all` (a more reliable check than my earlier regex-based grep, which missed untyped constants). Removed: - VerifierType10g / VerifierType11g / VerifierType12c — defined but no callers; the only verifier type our code implements (18453, 12c+ PBKDF2+SHA512) is hardcoded, the three named constants were never referenced. The 18453-specific comments in the code retain the documentation. - ORA12660EncryptionRequired — defined but no callers. Only ORA1017InvalidCredentials is actually emitted. Post-sweep: every exported symbol in the oracle handler package has a call site. `go build` / `go vet` / `staticcheck` clean. * chore(pam-oracle): remove dead enum-block const members staticcheck's U1000 has an exemption for const blocks: if any member is used, sibling members aren't flagged as unused even when they are. An exhaustive sweep that enumerates every top-level identifier via manual grep (so const-block membership is irrelevant) caught the ones staticcheck skipped. Removed: - tns.go: PacketTypeAbort, PacketTypeAck, PacketTypeAttn, PacketTypeCtrl, PacketTypeNull. Only the 7 PacketType values we actually dispatch on remain. - query_logger.go: ttcFuncOFETCH, ttcFuncOCLOSE, ttcFuncOSTMT, ttcFuncOLOGOFF, ttcMsgPiggyback. Only the 4 TTC opcodes the query tap actually looks for remain (OALL8, OCOMMIT, ORLLBK, the outer msgFunction). Post-sweep: exhaustive enumeration finds zero unused symbols across the package (189 candidates checked). go build / go vet / staticcheck clean. * fix(pam-oracle): show Infisical account name (not real DB user) in the banner The banner was printing the real upstream DB username (pamResponse.Metadata["username"]) in the connection URL, even though the preceding "Account:" label already shows the Infisical account name. Since the gateway now rewrites the client-supplied username in the O5Logon exchange to the configured real user, the client can (and should) connect using the account name — and the banner makes that explicit. Before: Resource: aws-oracledb Account: admin2 oracle://admin:password@localhost:53521/DATABASE ← confusing After: Resource: aws-oracledb Account: admin2 oracle://admin2:password@localhost:53521/DATABASE ← matches the label Scope: Oracle only for now. The postgres/mysql/mssql handlers also overwrite the client username on the wire, so the same banner change would work there, but that needs a separate verification pass per dialect before we extend it. * fix(pam-oracle): inject SERVICE_NAME from config into CONNECT packet The client's CONNECT description string was forwarded unchanged, requiring users to know the real Oracle service name. Now we rewrite SERVICE_NAME to match InjectDatabase from the vault config, consistent with how username and password are already injected. * fix(pam-oracle): remove placeholder password verification in phase-2 The gateway no longer checks whether the client sent the placeholder password "password". It unconditionally encrypts the real password from the vault, regardless of what the client typed. Auth will succeed either way since we inject the real credentials. The placeholder password is still shown to the user in the CLI banner and still used for key derivation in phase-1 — only the verification check is removed. * fix(pam-oracle): forward phase-2 response directly, remove SVR_RESPONSE regen AUTH_SVR_RESPONSE is encrypted with encKey, which is derived from session keys + CSK salt — not the password. The client and Oracle derive the same encKey, so Oracle's original proof is already valid for the client. No need to regenerate it. Removes translatePhase2Response, BuildSvrResponse, and the placeholderEncKey field from ProxyAuthState. * fix(pam-oracle): replace 3s timeout peek with deterministic supplement drain The post-ACCEPT supplement peek used a 3-second read deadline to detect whether a go-ora client sent connect-data as a separate packet. This added a fixed 3s delay for every non-go-ora client (sqlcl, JDBC thin). Instead, check the CONNECT packet structure: if connect-data-length + connect-data-offset exceeds the packet size, the data wasn't inline and a supplement will follow. Track whether the RESEND handler already consumed it; if not, do a blocking read (the supplement is guaranteed to be in the TCP buffer since the client sent it before waiting for a response). Removes prependedConn, detectConnectDataSupplement, and the timeout. * chore(pam-oracle): strip comments, enforce InjectDatabase, fix ora() formatting - Remove verbose comments across all files (~20% → ~2% comment rate) - Remove per-file go-ora attribution (ATTRIBUTION.md carries the license) - Trim ATTRIBUTION.md to just the copyright notice + MIT text - Make InjectDatabase mandatory (error if empty, always overwrite client's SERVICE_NAME) - Format unknown Oracle error codes as ORA-XXXXX instead of bare "ERROR" - Clarify ProxyPasswordPlaceholder as a decoy * chore(pam-oracle): remove redundant banner comment and auth note * fix(pam-oracle): forward connect-data supplement before handshake loop JDBC thin sends the CONNECT description as a separate follow-up packet when it doesn't fit inline (74-byte header-only CONNECT). Previously we drained this supplement only after ACCEPT, relying on the server to RESEND. AWS RDS sends RESEND, but OCI Autonomous DB does not — it waits for the connect-data and eventually drops the connection. Fix: check the CONNECT packet for non-inline data (connect-data-offset + connect-data-length > packet size) and forward the supplement immediately. Also forward it again after a RESEND+TLS-restart, since the client re-sends CONNECT + supplement over the new TLS session. * fix(pam-oracle): restore placeholder password check for clearer errors Wrong password fails cryptographically anyway (ORA-17452 from garbage decryption), but the error is confusing. Restore the explicit check so the client gets a clean ORA-01017 instead. * fix(pam-oracle): show real DB username in banner, matching other handlers * chore(pam-oracle): update ProxyPasswordPlaceholder comment * chore(pam-oracle): simplify ProxyPasswordPlaceholder comment * chore(pam-oracle): rename ResourceTypeOracle to ResourceTypeOracledb * fix(pam-oracle): cap handshake loop, zero keys after auth, remove dead comment - Add maxHandshakeAttempts (10) to prevent infinite RESEND loops - Zero RealKey, PlaceholderKey, ServerSessKey after auth completes - Remove leftover debug comment from proxy_auth.go * chore(pam-oracle): simplify key cleanup to state = nil * fix(pam-oracle): show Easy Connect and JDBC connection strings in banner --------- Co-authored-by: saif <11242541+saifsmailbox98@users.noreply.github.com>

* debug(pam): add diagnostic logging for SSH resource creation flake Add detailed logging when CreateSshPamResource returns non-200 to diagnose the intermittent 400 error seen only in CI. On failure, dumps: - Response body (the actual backend error message) - Request params (gatewayId, projectId, host, port) - Relay/gateway process status and stderr/stdout - Gateway API state (heartbeat, name) - Backend container logs (last 3000 chars) Also temporarily scopes CI to only PAM SSH tests in a loop (up to 10 runs, stops on first failure) to maximize chance of catching the flake. * fix(pam): use localhost for SSH resource host to avoid iptables race The SSH e2e tests were using getOutboundIP() (e.g. 10.1.0.34) as the resource host. The gateway dials this address to validate the connection during resource creation. On Linux CI with native Docker, this path goes through iptables DNAT rules which can have a brief propagation delay after container start — causing intermittent "connection refused" even though testcontainers confirmed the port was listening (via localhost). CI error: "Unable to validate connection to ssh: Connection lost before handshake" with gateway log: "dial tcp 10.1.0.34:32774: connect: connection refused" Fix: use localhost instead. The gateway runs on the host, so localhost:port reaches docker-proxy directly on the loopback interface without needing iptables DNAT. Diagnostic logging kept in place to catch any other failure modes. * debug(ci): add sleep between loop iterations for port release The relay binds to hardcoded ports (8443, 2222). When running multiple test iterations in a loop, the previous relay's ports may still be in TIME_WAIT when the next iteration starts, causing "Relay is reachable" to never appear. Add 15s sleep between iterations. * experiment: revert to getOutboundIP to confirm flake reproduces in CI Reverting the localhost fix to verify that getOutboundIP triggers the iptables race ("connection refused") in CI. If this fails, it confirms the root cause. Will re-apply the fix afterward. * fix(pam): re-apply localhost fix after confirming getOutboundIP flake Experiment confirmed: - getOutboundIP (10.1.0.168): FAILED run 1 — "dial tcp 10.1.0.168:32773: connect: connection refused" (iptables DNAT race) - localhost: 20/20 passed across two CI runs Re-applying the localhost fix. * experiment: second run with getOutboundIP to confirm reproducibility * fix(pam): re-apply localhost fix — confirmed root cause with A/B test A/B experiment in CI confirms getOutboundIP causes the flake: getOutboundIP (external IP): - Experiment 1: FAILED run 1 (password, 10.1.0.168:32773) - Experiment 2: FAILED run 4 (certificate, 10.1.0.114:32799) - Error: "dial tcp <external-ip>:<port>: connect: connection refused" localhost: - 3 CI runs: PASSED 25/25 consecutively Root cause: on Linux CI with native Docker, port mappings via external IP go through iptables DNAT rules which have a brief propagation delay after container start. localhost bypasses this via docker-proxy on loopback. * fix(pam): finalize — restore CI workflow, keep diagnostics - Restore all CI jobs (CLI, Agent, PAM) to normal single-run mode - Run full PAM test suite (not just SSH) - Keep diagnostic dump on createSSHPamResource failure - Log resourceHost value so any future failure shows which path was used * test: validation loop with log correlation check Runs SSH tests 5 times with localhost fix, then greps all logs for "connection refused" to verify the error is completely absent. With getOutboundIP: "connection refused" appeared in gateway logs right before every failure. With localhost: should see zero occurrences across all 5 runs. * fix(pam): use localhost for SSH test resource host + add failure diagnostics Root cause: SSH e2e tests used getOutboundIP() (e.g. 10.1.0.168) as the resource host. During backend's validateConnection(), the gateway dials this address to reach the SSH container. On Linux CI with native Docker, this path goes through iptables DNAT rules which have a brief propagation delay after container start. testcontainers confirms the port via localhost (docker-proxy on loopback), but the iptables path isn't ready yet — causing "connection refused" at the gateway. Fix: use "localhost" instead. The gateway runs on the Docker host, so localhost reaches the port mapping directly via docker-proxy without needing iptables DNAT. Verified with A/B experiment in CI: getOutboundIP: FAILED 2/2 runs ("connection refused" in gateway logs) localhost: PASSED 30/30 runs (zero "connection refused" in logs) Also adds diagnostic dump on createSSHPamResource failure (response body, relay/gateway logs, backend container logs, gateway API state) for visibility into any future failure modes. * fix(pam): get SSH resource host from container.Host() like Postgres/Redis Use container.Host(ctx) instead of getOutboundIP() or hardcoded "localhost". This is the same pattern Postgres and Redis tests use, and returns the correct Docker daemon host (localhost on Linux, which routes through docker-proxy without iptables DNAT races). * fix(pam): use container.Host() for SSH resource host, remove debug code Get the SSH resource host from container.Host(ctx) — same as Postgres and Redis tests — instead of getOutboundIP(). On Linux CI, Host() returns localhost which routes through docker-proxy on loopback, avoiding the iptables DNAT propagation race that caused intermittent "connection refused" when using the host's external IP. Removes all diagnostic logging added during investigation. * chore: restore original workflow file (no changes needed) * debug: log response body on createSSHPamResource failure * debug: SSH-only loop to catch second flake mode with response body logging * ci: retrigger loop to catch rare second flake * debug: add gateway/relay stderr dump on failure to catch second flake * ci: retrigger to catch rare flake (attempt 3) * fix(pam): wait for sshd "Server listening" log, not just port The SSH container's port 22 becomes listenable before sshd is fully ready to accept connections. testcontainers' ForListeningPort confirms the port is open, but sshd may still be generating host keys or loading config. Connections during this window get "connection reset by peer" (0 bytes from service), which the backend sees as "Connection lost before handshake". Fix: use ForAll(ForListeningPort, ForLog("Server listening")) to wait for both port availability AND sshd's startup confirmation log. * ci: retrigger for confidence * fix(pam): use SSH handshake to verify sshd readiness after SIGHUP After configureCertAuth sends SIGHUP to reload sshd config, the test checked readiness with a TCP dial. But sshd never closes its listen socket during SIGHUP, so the port is always reachable — the dial passes instantly even while sshd is still processing the config reload. Connections during this window get accepted then dropped, causing "Connection to 127.0.0.1 closed by remote host". Fix: do an actual SSH handshake (ssh.Dial with password auth) instead of a TCP dial. This confirms sshd is fully serving connections, not just listening. * ci: retrigger confidence run * fix(pam): fix three SSH e2e test flakes 1. Use container.Host(ctx) for resource host instead of getOutboundIP(). On Linux CI, external IPs route through iptables DNAT which has a propagation delay after container start. 2. Wait for sshd "Server listening" log, not just port open. ForListeningPort passes before sshd finishes key generation, causing connection resets. 3. Use SSH handshake to verify sshd readiness after SIGHUP config reload. sshd never closes its listen socket during SIGHUP, so TCP dial passes instantly while sshd is still reloading. --------- Co-authored-by: saif <11242541+saifsmailbox98@users.noreply.github.com>

6 subtests: connection, bad credentials, unreachable target, reconnect, concurrent connections, and session duration. Uses an xrdp container as the target and FreeRDP under xvfb for headless verification. CI pam-test job updated to build the Rust bridge and CLI with -tags rdp.

infisical-review-police · 2026-05-11T19:24:02Z

💬 Discussion in Slack: #pr-review-cli-230-add-rdp-e2e-tests

Posted by Review Police — reviews, comments, new commits, and CI failures will stream into this channel.

The backend requires a pam_project_recording_configs row to exist before allowing Windows resource creation. Insert a dummy row directly into the database, bypassing FK checks since we don't need a real app connection for the E2E tests.

x032205 and others added 12 commits May 8, 2026 04:04

fix: detect and clean up dead gateway connections to prevent OOM

71aba63

codex review fix

786fcd7

address claude review

f9dd5b6

Merge branch 'main' into PAM-205

3ccfec6

# Conflicts: # packages/pam/pam-proxy.go

move session cleanup to only run if a session was found and cancelled

19a086f

tiny noise fix

268a8e6

Merge pull request #223 from Infisical/PAM-205

171b125

fix: add keepalive, idle timeout, and session cleanup to gateway

Merge pull request #225 from Infisical/fix/goreleaser-skip-build

fc0ac97

fix(release): remove --skip=build flag unsupported in goreleaser v1.26.2

bernie-g added 2 commits May 11, 2026 15:56

Merge remote-tracking branch 'origin/main' into feat/pam-rdp-tests

a46575e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add RDP E2E tests#230

Add RDP E2E tests#230
bernie-g wants to merge 14 commits into
feat/pam-rdp-recordingfrom
feat/pam-rdp-tests

bernie-g commented May 11, 2026

Uh oh!

infisical-review-police Bot commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

bernie-g commented May 11, 2026

Summary

Test plan

Uh oh!

infisical-review-police Bot commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants