Site icon Thunder Bird 37

SQL Join is changing values of my existing column

SQL Join is changing values of my existing column

I’m attempting to use SQL to pull data from a database into a Jupyter (python) notebook and work with it thereI have a query that pulls the yearweek of flight’s upload date, and counts the number of flights in that yearweekFinally, it groups the results by the yearweek of upload date:

SELECT YEARWEEK(d.upload_date), COUNT(f.id)
FROM apps_flight f
LEFT JOIN apps_enginedatafile d ON d.id=f.import_file_id
WHERE f.global_duplicate = 0
GROUP BY YEARWEEK(d.upload_date)

I want to count number of subscribers (located in another table) from each yearweek to compare them to count of flightsSo I’m trying to join said table by adding:

LEFT JOIN apps_subscription s ON s.basesubscription_ptr_id = f.id

But, when I do this, the counts of my flight values change! The first few counts for the original query look like:

[327, 605, 78, 5768, 9716, 9686, 7902, 3699, 3323, 6081, 4966, 3456, 3181, 2749, 4577, 3157, 1792, 1806, …]

After joining the table, it becomes:

[327, 738, 78, 8854, 17418, 16156, 13921, 7536, 5380, 10040, 7559, 5461, 6323, 6412, 6702, 5433, 2924, …]

I’m not sure what’s happening herePerhaps the join is creating duplicate rows? The data set is very large, and takes about 30 minutes to run the queryAdding a LIMIT doesn’t seem to speed it up, so as you can imagine, testing takes a little while(If I’m oblivious to another way to speed up the query aside from a LIMIT, feel free to make me aware)!

Thanks for any info.

Exit mobile version