Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion client/model/snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,6 @@ type TakeSnapshotRequest struct {
}

type TakeSnapshotResponse struct {
Message string `json:"message,omitempty"`
UUID string `json:"uuid"`
SnapKey string `json:"snapKey"`
}
14 changes: 14 additions & 0 deletions client/vm.go
Original file line number Diff line number Diff line change
Expand Up @@ -183,3 +183,17 @@ func (c *CoreClient) ForceShutdownVM(ctx context.Context, req model.ForceShutdow
}
return response, nil
}

// TakeExternalSnapshot asks Core to take a snapshot and upload it to RustFS via the given presigned PUT URL.
// Uses a dedicated HTTP client with a 12-minute timeout because snapshot creation + upload can take several minutes.
func (c *CoreClient) TakeExternalSnapshot(ctx context.Context, req model.TakeSnapshotRequest) (model.TakeSnapshotResponse, error) {
snapClient := &CoreClient{
baseURL: c.baseURL,
client: &http.Client{Timeout: 12 * time.Minute},
}
var response model.TakeSnapshotResponse
if err := snapClient.doRequest(ctx, http.MethodPost, "/TakeExternalSnapshot", req, &response); err != nil {
Comment on lines +189 to +195

@coderabbitai coderabbitai Bot Jun 28, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Redact the presigned URL before this request goes through doRequest.

model.TakeSnapshotRequest includes PresignedURL, and doRequest logs the full JSON body at client/vm.go Line 41. This new path would write a live upload credential into logs for every snapshot request unless that body logging is skipped or redacted first.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@client/vm.go` around lines 189 - 195, The TakeExternalSnapshot path is
sending model.TakeSnapshotRequest into doRequest without redacting PresignedURL,
which will leak the presigned upload URL in the request-body logs. Update
CoreClient.TakeExternalSnapshot to sanitize or copy-and-clear the PresignedURL
before calling doRequest, or bypass the body logging in doRequest for this
request, so the JSON logged from the TakeExternalSnapshot flow no longer
contains the credential.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

12분 정도의 긴 Timeout 방식보다 webhook 방식이 연결 자원 점유 문제나 디버깅 상태 확인 그리고 안정성에서는 더 나은 해결책이 될 수 있을 것 같습니다.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ukleon123 맞습니다. 장기적으로는 webhook/async 방식이 말씀하신 것처럼 연결 자원 점유, 진행 상태 확인, 재시도/복구 측면에서 더 나은 방향으로 보입니다.

다만 이번 코멘트의 핵심은 timeout 전략과는 별개로, 현재 구현에서는 model.TakeSnapshotRequest.PresignedURLclient/vm.go Line 41의 request-body 로그에 그대로 남는 점입니다. 그래서 지금 구조를 유지하더라도 이 값은 마스킹하거나 로그에서 제외하는 처리가 필요합니다.

즉:

  • 중장기 개선안: webhook/비동기 전환
  • 이번 PR의 최소 안전 조치: TakeExternalSnapshot 경로에서 PresignedURL 로그 유출 방지

원하시면 webhook 전환은 후속 이슈로 분리하고, 이 PR에서는 먼저 presigned URL 로깅만 막는 방향으로 보는 게 좋겠습니다.

return model.TakeSnapshotResponse{}, err
}
return response, nil
}
13 changes: 9 additions & 4 deletions service/external_snap.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"time"

"github.com/easy-cloud-Knet/KWS_Control/client"
"github.com/easy-cloud-Knet/KWS_Control/client/model"
vms "github.com/easy-cloud-Knet/KWS_Control/structure"
)

Expand Down Expand Up @@ -52,9 +53,13 @@ func TakeSnapshot(uuid vms.UUID, snapName string, ctx *vms.ControlContext) error
return fmt.Errorf("TakeSnapshot %s: failed to generate presigned URL: %w", uuid, err)
}

// TODO: coreClient.TakeSnapshot(context.Background(), model.TakeSnapshotRequest{
// UUID: uuid, SnapKey: snapName, PresignedURL: presignedURL,
// })
_ = presignedURL
coreClient := client.NewCoreClient(core)
if _, err := coreClient.TakeExternalSnapshot(context.Background(), model.TakeSnapshotRequest{
UUID: uuid,
SnapKey: snapName,
PresignedURL: presignedURL,
}); err != nil {
return fmt.Errorf("TakeSnapshot %s: core failed: %w", uuid, err)
}
Comment on lines +56 to +63

@coderabbitai coderabbitai Bot Jun 28, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Propagate the caller context into the Core snapshot call.

This new 12-minute request is issued with context.Background(), so cancellation from POST /vm/snapshot will not abort the Core work. The same detachment already exists at service/external_snap.go Line 51, so the fix belongs at TakeSnapshot's signature and callers rather than only at this line.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@service/external_snap.go` around lines 56 - 63, The Core snapshot request is
being started with a detached background context, so cancellation from the
caller cannot stop the work. Update TakeSnapshot to accept and pass through the
caller’s context, then thread that context from its callers instead of using
context.Background(); use the existing TakeSnapshot and TakeExternalSnapshot
call sites to locate the change and keep the error handling unchanged.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code rabbit이 말해준 영역이긴 하지만 Timeout 설정을 Background context에서 수행하는 방식보다는 timeout context를 사용하는 것이 나을것으로 보이긴 합니다.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ukleon123 맞습니다. 핵심은 Background()로 분리하지 않는 것이고, timeout이 필요하다면 말씀하신 것처럼 caller context를 그대로 전파한 뒤 그 위에 context.WithTimeout(...)을 씌우는 방식이 더 자연스럽습니다.

즉, 이상적인 형태는:

  • TakeSnapshot이 호출자의 ctx를 받음
  • ctxWithTimeout, cancel := context.WithTimeout(ctx, 12*time.Minute) 처럼 파생
  • ctxWithTimeoutTakeExternalSnapshot까지 전달

이렇게 하면:

  • 상위 요청 취소가 그대로 전파되고
  • snapshot 작업의 최대 시간도 명시적으로 제한할 수 있습니다

반대로 context.Background()에서 시작하면 상위 요청과 분리되어 취소 전파가 끊기는 점이 문제라, 그 부분은 그대로 유효한 우려였습니다.
현재 http.Client.Timeout만으로도 상한은 걸리지만, 취소 전파 관점에서는 timeout context 쪽이 더 낫다는 의견에 동의합니다.

(•ᴗ•)◞🐇


✏️ Learnings added
Learnt from: ukleon123
Repo: easy-cloud-Knet/KWS_Control PR: 70
File: service/external_snap.go:56-63
Timestamp: 2026-06-29T07:14:15.915Z
Learning: In the KWS_Control repository, for long-running operations such as external snapshot requests, prefer deriving `context.WithTimeout(...)` from the caller-provided context instead of starting from `context.Background()`, so cancellation from the upstream request is preserved while still enforcing a maximum duration.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

return nil
}