But how can we then ensure that I am not adding/processing products which are already in the "final" table, when I have no knowledge about ALL the products which are in this final table?
Without knowing your schema, I can't answer this precisely. However, the database doesn't need to scan every row in a table to check whether a value exists if you can build an index on the relevant columns. If your products have some unique ID (or a tuple of columns that is unique), then you can usually build an index on those values, which means the DB builds what is basically a lookup table for those indexed columns.
Without going into too much detail, you can think of an index as a way for a DB to make a "contains" (or "retrieve") operation drop from O(n) (check all rows) to something much faster, like O(log n). The tradeoff is that the index takes extra space, and writes get a bit slower because the index has to be kept up to date.
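If you want to see the difference for yourself, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere (Postgres has its own EXPLAIN with different output), and the table, column, and index names are made up for illustration. It asks the query planner how it would look up one product before and after an index exists.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO products (sku, name) VALUES (?, ?)",
    [(f"SKU-{i}", f"Product {i}") for i in range(1000)],
)


def query_plan(connection, sku):
    """Return the planner's description of how it would look up one sku."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM products WHERE sku = ?", (sku,)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn, "SKU-500")
conn.execute("CREATE INDEX idx_products_sku ON products (sku)")
plan_with_index = query_plan(conn, "SKU-500")

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The exact wording of the plan depends on your SQLite version, but the first plan should describe a scan of the whole table and the second a search that uses idx_products_sku.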
This comes with an added benefit that uniqueness constraints can be easily enforced on indexed columns if needed. And yes, your PK is indexed by default.
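To tie this back to your question about not re-adding products that are already in the final table, here is a rough sketch of letting a unique index do the checking for you. Again this uses sqlite3 so it is runnable as-is, and final_products, sku, and load_products are hypothetical names, not anything from your schema. In Postgres the equivalent idea would be INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3


def load_products(conn, rows):
    """Insert rows, skipping any whose sku already exists.

    Returns the number of rows actually inserted.
    """
    before = conn.total_changes
    # OR IGNORE skips rows that would violate the unique index instead of
    # raising an error, so the caller never has to read the whole table first.
    conn.executemany(
        "INSERT OR IGNORE INTO final_products (sku, name) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn.total_changes - before


# Hypothetical schema: sku is assumed to be the unique product identifier.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE final_products (id INTEGER PRIMARY KEY, sku TEXT NOT NULL, name TEXT)"
)
conn.execute("CREATE UNIQUE INDEX idx_final_products_sku ON final_products (sku)")

first = load_products(conn, [("A-1", "Widget"), ("B-2", "Gadget")])
second = load_products(conn, [("B-2", "Gadget"), ("C-3", "Gizmo")])
total = conn.execute("SELECT COUNT(*) FROM final_products").fetchone()[0]
print(first, second, total)
```

The second batch overlaps the first on B-2, so only C-3 gets inserted. The point is that the DB rejects the duplicate through the index lookup; your code doesn't need to know everything that is already in the table.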
Read more about indexes in the Postgres docs. In my experience the documentation is pretty readable. Or read a book on indexes, watch a video, etc. The concept is universal.
Could you elaborate on what you mean by read replicas? Storage in memory?
This highly depends on your needs. I'll link PG's docs on replication though.
If you're migrating right now, I wouldn't think about this too much. Replicas are basically copies of your database hosted on different servers (ideally in different data centers, or even different regions if possible). Replicas work together to stay in sync, and depending on the kind of replica and the kind of query, any replica may be able to handle an incoming query (rather than everything going through a single central database).
If all you need are backups though, then replicas could be overkill. Either way, you generally don't want all of your prod data stored on a single machine. I would talk to your management about backup requirements and potentially availability/uptime requirements.
Thank you for giving us a great example of how to appropriately use AI: turning a long comment with no line breaks into a blog post summarizing the comment.
Now I just need to pass your comment into ChatGPT to get a short summary.
Edit: I asked for a one-sentence summary of it and this is what I got: