Bug: PySUS.download sets status to DOWNLOADING instead of COMPLETED on successful download #285

@morsoletodev

Description

There is a logic error in the PySUS.download method in client.py. After the underlying client finishes downloading a remote file, the orchestrator updates its local database state, but on line 309 it passes status=DownloadStatus.DOWNLOADING instead of marking the download as finished with DownloadStatus.COMPLETED.

Because get_local_file queries only for records with DownloadStatus.COMPLETED, successfully downloaded files are never recognized as existing local copies. This breaks the caching mechanism and forces files to be re-downloaded on every call. The effect is visible in the file's last-modified timestamp, which changes each time download is called.
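The interaction between the two methods can be illustrated with a small, self-contained model. This is only a sketch: FakeOrchestrator, its attributes, and the string values of the enum are hypothetical stand-ins written for this issue, not PySUS code. It shows that when the recorded status is DOWNLOADING, a lookup that accepts only COMPLETED never produces a cache hit.

```python
import enum


class DownloadStatus(enum.Enum):
    # Stand-in for the real enum; only the two members named in this issue.
    DOWNLOADING = "downloading"
    COMPLETED = "completed"


class FakeOrchestrator:
    """Hypothetical in-memory model of the state tracking, not PySUS code."""

    def __init__(self, final_status):
        self.final_status = final_status  # status written after a download
        self.state = {}                   # remote_path -> DownloadStatus
        self.transfers = 0                # number of real transfers performed

    def get_local_file(self, remote_path):
        # Mirrors the described lookup: only COMPLETED records count as cached.
        if self.state.get(remote_path) is DownloadStatus.COMPLETED:
            return remote_path
        return None

    def download(self, remote_path):
        if self.get_local_file(remote_path) is not None:
            return remote_path            # cache hit, nothing transferred
        self.transfers += 1               # cache miss, transfer the file
        self.state[remote_path] = self.final_status
        return remote_path


buggy = FakeOrchestrator(DownloadStatus.DOWNLOADING)
fixed = FakeOrchestrator(DownloadStatus.COMPLETED)
for orchestrator in (buggy, fixed):
    orchestrator.download("example.dbc")
    orchestrator.download("example.dbc")

print(buggy.transfers, fixed.transfers)  # 2 1
```

With the DOWNLOADING status the model transfers the file twice; with COMPLETED the second call is served from the recorded state.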

How to Reproduce

import asyncio
import os

# Must be set before importing pysus. Points to the project root,
# which makes a manual check easy (if needed).
os.environ["PYSUS_CACHEPATH"] = "."

from pysus.api.client import PySUS


async def main():
    async with PySUS() as pysus:
        # Query catalog
        files = await pysus.query(
            dataset="sinasc",
            state="AC",
            year=2020,
        )

        # Download files
        for f in files:
            local = await pysus.download(f)
            print("Modification time after first download: ", os.path.getmtime(local.path))

            # Wait without blocking the event loop, so a re-download
            # produces a visibly later modification time
            await asyncio.sleep(10)

            # Attempt the download again; a working cache leaves the file untouched
            local = await pysus.download(f)
            print("Modification time after second download: ", os.path.getmtime(local.path))


if __name__ == "__main__":
    asyncio.run(main())

Expected Behavior

Once the client successfully downloads the file, the local file state tracking should update the status to DownloadStatus.COMPLETED.

Actual Behavior

The file status remains DownloadStatus.DOWNLOADING indefinitely.

Suspected Code Location

In client.py, inside the download method's try block, the status passed on line 309 should be DownloadStatus.COMPLETED:

PySUS/pysus/api/client.py

Lines 299 to 315 in c4c643d

if timeout is not None:
    with anyio.fail_after(timeout):
        await client.download(file, local_path, callback)
else:
    await client.download(file, local_path, callback)
await self._update_state(
    local_path=local_path,
    remote_path=str(remote_path),
    client_name=client_name,
    status=DownloadStatus.DOWNLOADING,
    year=file.year,
    month=file.month,
    state=file.state,
    group=getattr(file.group, "name", None),
)
return await ExtensionFactory.instantiate(local_path)

Proposed Fix

Change status=DownloadStatus.DOWNLOADING to status=DownloadStatus.COMPLETED at the end of the successful download sequence.
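A regression check for the fix could follow the same idea as the reproduction script without the sleep. The sketch below is an assumption about how such a check might look, not an existing PySUS test: is_served_from_cache, make_fake_download, and SENTINEL are hypothetical names, and the fake downloader only stands in for pysus.download so the example runs on its own. It backdates the file's modification time after the first download and verifies that the second call leaves it untouched.

```python
import asyncio
import os
import tempfile

SENTINEL = 1_000_000_000  # arbitrary old timestamp (September 2001)


async def is_served_from_cache(download, file):
    """Return True if a second download leaves the local file untouched."""
    path = await download(file)
    os.utime(path, (SENTINEL, SENTINEL))  # backdate atime and mtime
    path = await download(file)
    # A rewritten file gets a fresh mtime; a cached one keeps the sentinel.
    return int(os.path.getmtime(path)) == SENTINEL


def make_fake_download(directory, use_cache):
    """Hypothetical stand-in for a downloader that returns a local path."""

    async def download(name):
        path = os.path.join(directory, name)
        if not (use_cache and os.path.exists(path)):
            with open(path, "wb") as fh:
                fh.write(b"data")  # simulates transferring the file again
        return path

    return download


with tempfile.TemporaryDirectory() as tmp:
    with_cache = asyncio.run(
        is_served_from_cache(make_fake_download(tmp, True), "a.bin")
    )
    without_cache = asyncio.run(
        is_served_from_cache(make_fake_download(tmp, False), "b.bin")
    )

print(with_cache, without_cache)  # True False
```

Against the real client, the download argument would need to be a small wrapper that calls pysus.download and returns local.path; whether backdating the file interferes with PySUS's own bookkeeping is untested here.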
