Description
There is a logic error in the PySUS.download method in client.py. After the underlying client finishes downloading a remote file, the orchestrator updates its local database state, but on line 309 it passes status=DownloadStatus.DOWNLOADING instead of marking the record as finished with DownloadStatus.COMPLETED.
Because get_local_file only queries for records with DownloadStatus.COMPLETED, successfully downloaded files are never recognized as existing local copies. This breaks the caching mechanism and forces every file to be re-downloaded on each call. The re-download can be observed through the file's last-modified timestamp.
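The interaction described above can be illustrated with a small in-memory model. This is only a sketch, not the PySUS implementation: StateStore, its methods, the simplified download function and the enum values are hypothetical stand-ins, and the only behavior taken from this report is that the lookup accepts COMPLETED records exclusively.

```python
from enum import Enum


class DownloadStatus(Enum):
    # Stand-in for the project's enum; the string values are assumptions.
    DOWNLOADING = "downloading"
    COMPLETED = "completed"


class StateStore:
    """Hypothetical in-memory replacement for the local state database."""

    def __init__(self):
        self._records = {}

    def update_state(self, remote_path, local_path, status):
        self._records[remote_path] = (local_path, status)

    def get_local_file(self, remote_path):
        # As described in this report: only COMPLETED records are cache hits.
        record = self._records.get(remote_path)
        if record is None or record[1] is not DownloadStatus.COMPLETED:
            return None
        return record[0]


def download(store, remote_path, final_status):
    """Return (local_path, was_cached) for a simulated download."""
    cached = store.get_local_file(remote_path)
    if cached is not None:
        return cached, True
    # Simulated transfer: derive a local path, then record the final status.
    local_path = "/cache/" + remote_path.rsplit("/", 1)[-1]
    store.update_state(remote_path, local_path, final_status)
    return local_path, False


buggy, fixed = StateStore(), StateStore()
for _ in range(2):
    buggy_result = download(buggy, "remote/example.dbc", DownloadStatus.DOWNLOADING)
    fixed_result = download(fixed, "remote/example.dbc", DownloadStatus.COMPLETED)

print("buggy second call cached:", buggy_result[1])
print("fixed second call cached:", fixed_result[1])
```

In this model the second call is a cache miss when the record is left as DOWNLOADING and a cache hit when it is stored as COMPLETED, which matches the behavior reported here.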
How to Reproduce
import asyncio
import os

# Must be set before importing pysus. Points the cache at the project root,
# which makes a manual check easy if needed.
os.environ["PYSUS_CACHEPATH"] = "."

from pysus.api.client import PySUS


async def main():
    async with PySUS() as pysus:
        # Query catalog
        files = await pysus.query(
            dataset="sinasc",
            state="AC",
            year=2020,
        )

        # Download files
        for f in files:
            local = await pysus.download(f)
            print("Creation time: ", os.path.getmtime(local.path))

            # Small delay so the modification time visibly increases
            # if the file is re-downloaded
            await asyncio.sleep(10)

            # Attempt the download again
            local = await pysus.download(f)
            print("Last modification: ", os.path.getmtime(local.path))


if __name__ == "__main__":
    asyncio.run(main())
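The visual comparison of the two printed timestamps could also be turned into an automatic check. The sketch below is an illustration that runs without PySUS: was_rewritten is a hypothetical helper, the temporary file stands in for a downloaded file, and the re-download is simulated by rewriting the file and moving its modification time forward explicitly.

```python
import os
import tempfile


def was_rewritten(path, mtime_before_ns):
    """Return True if the file's modification time differs from mtime_before_ns."""
    return os.stat(path).st_mtime_ns != mtime_before_ns


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.dbc")

    # Stand-in for the first download.
    with open(path, "wb") as fh:
        fh.write(b"first download")
    before = os.stat(path).st_mtime_ns

    # Cache hit: nothing touches the file, so the timestamp is unchanged.
    untouched = was_rewritten(path, before)

    # Simulated re-download: rewrite the file and set the mtime 10 seconds
    # ahead so the check does not depend on filesystem timestamp resolution.
    with open(path, "wb") as fh:
        fh.write(b"second download")
    later = before + 10 * 10**9
    os.utime(path, ns=(later, later))
    rewritten = was_rewritten(path, before)

print("unchanged file flagged:", untouched)
print("rewritten file flagged:", rewritten)
```

In the reproduction script, the same comparison could be applied to local.path before and after the second pysus.download call, so that a working cache leaves the timestamp unchanged.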
Expected Behavior
Once the client successfully downloads the file, the local file state tracking should update the status to DownloadStatus.COMPLETED.
Actual Behavior
The file status remains DownloadStatus.DOWNLOADING indefinitely.
Suspected Code Location
In client.py, inside the download method's try block, line 309 should be DownloadStatus.COMPLETED:
if timeout is not None:
    with anyio.fail_after(timeout):
        await client.download(file, local_path, callback)
else:
    await client.download(file, local_path, callback)

await self._update_state(
    local_path=local_path,
    remote_path=str(remote_path),
    client_name=client_name,
    status=DownloadStatus.DOWNLOADING,
    year=file.year,
    month=file.month,
    state=file.state,
    group=getattr(file.group, "name", None),
)
return await ExtensionFactory.instantiate(local_path)
Proposed Fix
Change status=DownloadStatus.DOWNLOADING to status=DownloadStatus.COMPLETED at the end of the successful download sequence.
Code reference: PySUS/pysus/api/client.py, lines 299 to 315 in c4c643d