Explain the problem as you see it
Sometimes I have really long texts, notes, or essays that span almost 14 batches that need to be sent. In such cases, I get a flood of these messages:
Followed by:
Why is this a problem for you?
It becomes a problem because the remaining batches that cannot be processed are effectively lost information. The retry delay doubles on each attempt, which works, but when 10 other prompts are being sent simultaneously the client eventually hits the maximum number of retries, and much of the batch is lost.
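To illustrate why batches get lost: with doubling delays and a capped retry count, a request that keeps hitting the rate limit is abandoned after a fixed total wait. A minimal sketch (the base delay of 1 s, doubling factor, and 5-retry cap are assumptions for illustration, not the actual client's settings):

```python
def backoff_delays(base=1.0, factor=2.0, max_retries=5):
    # Exponential backoff: each retry waits twice as long as the previous one.
    return [base * factor ** attempt for attempt in range(max_retries)]

# Under these assumed settings, a request gives up after about
# 1 + 2 + 4 + 8 + 16 = 31 seconds of cumulative waiting; any batch
# still rate-limited at that point is simply dropped.
print(backoff_delays())
```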
Another issue arises when I'm running multiple AI generations for different nodes: they end up competing with each other.
So for both "batch prompt processing" and "multiple generations", having too many simultaneous connections ends up stopping the generation entirely.
Suggest a solution
It would be really nice to be able to set a maximum number of simultaneous queries to the OpenAI API, and to queue the rest. Having a viewable queue somewhere would also be really cool.
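The requested behavior can be sketched with a concurrency cap: a semaphore holds the limit, and tasks beyond it simply wait their turn instead of failing. This is a minimal sketch, assuming an async client and a user-configurable cap (here hardcoded to 3); the `fake_api_call` stand-in is hypothetical and would be the real OpenAI request in practice:

```python
import asyncio

MAX_CONCURRENT = 3  # assumed user-configurable setting

async def fake_api_call(prompt):
    # Stand-in for the real API request.
    await asyncio.sleep(0.01)
    return f"response for {prompt}"

async def limited_call(semaphore, prompt):
    # Acquire one of MAX_CONCURRENT slots; excess requests queue here
    # instead of hammering the API and triggering rate limits.
    async with semaphore:
        return await fake_api_call(prompt)

async def run_batches(prompts):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [limited_call(semaphore, p) for p in prompts]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    results = asyncio.run(run_batches([f"batch {i}" for i in range(14)]))
    print(len(results))
```

A viewable queue would only need the semaphore's waiter count (or an explicit `asyncio.Queue`) surfaced in the UI, so the user can see how many batches are pending rather than silently lost.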