Talk with ChatGPT using Python


This code is a voice-controlled AI chatbot. It uses the SpeechRecognition library to listen to the user's voice, converts the speech to text, and sends the text as a prompt to the OpenAI API. The API uses the text-davinci-003 model to generate a response. The response is then passed to the "speak" function, which converts it to audio and plays it using pygame. The code runs in a loop, so the user can keep asking questions and getting answers.

>> In the previous video, we covered these two functions: takeCommand() and speak()

Add the OpenAI code to these two functions and you can make ChatGPT speak!

# Here is a basic example: 

This code is a simple Python client for OpenAI's GPT-3 language model. The code imports the openai module and a config module containing the API key for accessing OpenAI's API.

In the while loop, the code prompts the user to input a "Question" and uses the "openai.Completion.create" method to generate a response. The method takes several arguments:

  • "model": The name of the GPT-3 language model to use.
  • "prompt": The user's question, passed as the input to the model.
  • "temperature": Controls the randomness of the model's responses. Higher values give more varied output.
  • "max_tokens": The maximum number of tokens (words or pieces of words) to generate.
  • "top_p": Nucleus sampling. Only the most likely tokens whose combined probability reaches top_p are considered; 1 means no tokens are excluded.
  • "frequency_penalty": A penalty that grows with how often a token has already appeared in the text so far, which discourages repetition.
  • "presence_penalty": A penalty applied once to any token that has already appeared in the text so far, which nudges the model toward new topics.
  • "stop": A list of sequences that, if encountered in the generated text, will cause the generation to stop.
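
To make "temperature" and "top_p" more concrete, here is a small illustrative sketch in plain Python. It is not OpenAI's implementation; the function names and the scores are made up for the example. It only shows the general idea: temperature rescales the scores before they become probabilities, and top_p keeps the most likely tokens until their combined probability reaches the threshold.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Return indices of the most likely tokens whose cumulative
    probability first reaches top_p (all indices if it never does)."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = apply_temperature(logits, 0.5)  # low temperature: top token dominates
flat = apply_temperature(logits, 2.0)   # high temperature: choices more even
print(sharp)
print(flat)
print(top_p_filter([0.7, 0.2, 0.1], 0.8))
```

With a low temperature the first token takes most of the probability, while a high temperature spreads it out, which is why 0.9 gives fairly varied replies.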

The response generated by the method is a dictionary containing the model's generated text. The code retrieves the first generated response and assigns it to the "text" variable. The generated response is then printed to the console, preceded by the string "Reply: ". The while loop allows the code to repeatedly generate responses to new questions until stopped.

#Example Code

    # pip install "openai<1.0"  (openai.Completion was removed in openai 1.0)
    import openai
    import config  # local config.py that defines Api, the OpenAI API key

    openai.api_key = config.Api

    while True:
        ask = input('Question: ')
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=ask,
            temperature=0.9,
            max_tokens=150,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0.6,
            stop=[" Human:", " AI:"]
        )

        # The first choice holds the generated text
        text = response['choices'][0]['text']
        print('Reply: ' + text)
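
The response handling can be tried out without calling the API. The sketch below uses a hand-written dictionary that mimics the shape described above (a "choices" list whose first item has a "text" field). The helper name first_reply and the sample text are made up for illustration, and a real response contains more fields than shown here.

```python
# A hand-written stand-in for an API response (not real API output).
sample_response = {
    "choices": [
        {
            "text": "\n\nPython is a programming language.",
            "index": 0,
            "finish_reason": "stop",
        }
    ]
}


def first_reply(response):
    """Return the text of the first choice, or "" if there is none."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    # Strip surrounding whitespace, since replies can start with blank lines.
    return choices[0].get("text", "").strip()


print("Reply: " + first_reply(sample_response))
print("Reply: " + first_reply({"choices": []}))
```

Guarding against an empty "choices" list avoids an IndexError if the API ever returns nothing usable.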

Modification is needed in the speak function!

This code takes the string "data" and splits it into chunks of 100 words each. Each chunk is joined back into a single string using the join method. An edge-tts system command is then built for that string using f-string formatting and run using the os.system method.

#Modified speak() function V2

  • chunks = data.split(): This line splits the input text "data" into a list of words by calling the "split" method on "data"
  • chunk_size = 100: This line sets the value of the "chunk_size" variable to 100. This will be used to determine the number of words in each chunk of text.
  • chunks = [chunks[i:i + chunk_size] for i in range(0, len(chunks), chunk_size)]: This line splits the list of words "chunks" into smaller lists of size "chunk_size" (100 words) using a list comprehension. The result is a list of sublists, each containing "chunk_size" words from the original list.
  • text = ' '.join(chunk): This line joins the words in each chunk of text back into a single string using the join method and the space character as a separator. The resulting string is assigned to the "text" variable.
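
The four steps above can be tried on their own, without any text-to-speech involved. This is a minimal sketch; split_into_chunks is a hypothetical helper name that does not appear in the original code, and the sample text is generated just for the demonstration.

```python
def split_into_chunks(text, chunk_size=100):
    """Split text into strings of at most chunk_size words each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


# 250 made-up words give two full chunks and one partial chunk.
sample = " ".join(f"word{n}" for n in range(250))
chunks = split_into_chunks(sample)
print([len(chunk.split()) for chunk in chunks])  # [100, 100, 50]
```

Note that splitting on whitespace can cut a sentence in the middle, so a short pause may be heard between chunks.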

#Modified speak() V2

    import os
    import pygame

    def speak(data):
        voice1 = "en-GB-SoniaNeural"
        filename = "data.mp3"

        # Split the input text into chunks of chunk_size words
        chunks = data.split()
        chunk_size = 100
        chunks = [chunks[i:i + chunk_size] for i in range(0, len(chunks), chunk_size)]

        # Convert and play each chunk
        for chunk in chunks:
            text = ' '.join(chunk)
            command1 = f'edge-tts --voice "{voice1}" --text "{text}" --write-media "{filename}"'
            os.system(command1)

            pygame.init()
            pygame.mixer.init()

            try:
                pygame.mixer.music.load(filename)
                pygame.mixer.music.play()
                # Wait until playback finishes, polling about 10 times per second
                while pygame.mixer.music.get_busy():
                    pygame.time.Clock().tick(10)
            except Exception as e:
                print(e)
            finally:
                pygame.mixer.music.stop()
                pygame.mixer.quit()

        # Return only after every chunk has been played
        return True
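
One fragile spot in speak() is the shell command: if the reply contains a double quote, the quoting inside the f-string breaks. A possible alternative, shown here only as a sketch, is to pass the arguments as a list to subprocess.run so that no shell quoting is involved. The helper names build_tts_command and synthesize are made up for this example; the edge-tts flags are the same ones used in the original command.

```python
import subprocess


def build_tts_command(text, voice="en-GB-SoniaNeural", filename="data.mp3"):
    """Build the edge-tts call as an argument list (no shell quoting needed)."""
    return ["edge-tts", "--voice", voice, "--text", text, "--write-media", filename]


def synthesize(text, voice="en-GB-SoniaNeural", filename="data.mp3"):
    """Run edge-tts; return True if the command exits successfully.
    Requires the edge-tts command to be installed and on PATH."""
    result = subprocess.run(build_tts_command(text, voice, filename), check=False)
    return result.returncode == 0


# Text with double quotes is passed through unchanged.
command = build_tts_command('She said "hello" and left.')
print(command)
```

Inside speak(), the os.system(command1) line could then be swapped for a call such as synthesize(text, voice1, filename).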

Q. Why split data into chunks?

The data is split into chunks so that each text-to-speech call receives at most chunk_size words (100 in this code; a smaller value such as 50 also works). The text-to-speech tool may have a limit on the length of its input text, and because the text is passed on the command line, very long text can also run into the shell's limit on command length. Additionally, converting a large amount of text at once may cause performance issues, so dividing the data into smaller chunks can improve the performance of the text-to-speech conversion.

#Full Code: Talk with ChatGPT!

    # pip install "openai<1.0"  (openai.Completion was removed in openai 1.0)
    import openai
    import os
    import pygame
    import config  # local config.py that defines Api, the OpenAI API key
    import speech_recognition as sr
    # pip install SpeechRecognition

    voice = "en-US-ChristopherNeural"
    openai.api_key = config.Api

    def takeCommand():
        r = sr.Recognizer()
        with sr.Microphone() as source:
            print("Listening...")
            r.pause_threshold = 1
            audio = r.listen(source)

        try:
            print("Recognising...")
            query = r.recognize_google(audio, language='en-us')
        except Exception as e:
            print(f"Could not recognise speech: {e}")
            return "---"
        return query

    def speak(data):
        # Split the input text into chunks of chunk_size words
        chunks = data.split()
        chunk_size = 100
        chunks = [chunks[i:i + chunk_size] for i in range(0, len(chunks), chunk_size)]

        # Convert and play each chunk
        for chunk in chunks:
            text = ' '.join(chunk)
            command1 = f'edge-tts --voice "{voice}" --text "{text}" --write-media "data.mp3"'
            os.system(command1)

            pygame.init()
            pygame.mixer.init()

            try:
                pygame.mixer.music.load("data.mp3")
                pygame.mixer.music.play()
                # Wait until playback finishes, polling about 10 times per second
                while pygame.mixer.music.get_busy():
                    pygame.time.Clock().tick(10)
            except Exception as e:
                print(e)
            finally:
                pygame.mixer.music.stop()
                pygame.mixer.quit()

        # Return only after every chunk has been played
        return True

    while True:
        query = takeCommand().lower()
        print(query)

        # ask = input('Question: ')
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=query,
            temperature=0.9,
            max_tokens=150,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0.6,
            stop=[" Human:", " AI:"]
        )

        data = response['choices'][0]['text']
        print(data)
        speak(data)


This code is an implementation of a voice-controlled AI chatbot. It uses various libraries and technologies to perform its functions. Let's go through the code step by step:

1. Import statements: The code imports the following libraries:

  • openai: OpenAI API library
  • os: Operating System library
  • pygame: A library used to play sounds
  • config: A configuration file that contains the API key for OpenAI
  • speech_recognition: A library used to perform speech recognition

2. Voice Variable: The variable "voice" is a string that represents the voice used to play the response generated by the OpenAI API. The value of the string is set to "en-US-ChristopherNeural".

3. OpenAI API Key: The API key is read from the config module and set using the following line of code:

  • openai.api_key = config.Api
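
The config module itself is not shown in this post. One possible layout, given here purely as an assumption, is a config.py file next to the main script that exposes the key under the name Api. In this sketch the key is read from an environment variable (OPENAI_API_KEY is an assumed name) so that it does not have to be written into the source file.

```python
# config.py (hypothetical layout; the original post does not show this file)
import os

# Read the key from an environment variable so it stays out of the source code.
# OPENAI_API_KEY is an assumed variable name; use whichever name you have set.
Api = os.environ.get("OPENAI_API_KEY", "")

if not Api:
    print("Warning: OPENAI_API_KEY is not set, so API calls will fail.")
```

Pasting the key directly as a string, for example Api = "your key here", also works, but then the file should not be shared or committed.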

4. takeCommand Function: This function is used to listen to the user's voice and convert it to text. It performs the following steps:

  • Initialize a Recognizer object from the speech_recognition library
  • Open the default microphone using the "with" statement and start listening to the audio
  • Use the recognize_google() method from the speech_recognition library to recognize the speech and convert it to text
  • If any exception occurs, return "---"
  • Return the recognized text
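
The "return '---' on any exception" step can be illustrated without a microphone. In this sketch the recognizer is replaced by ordinary functions passed in as arguments; recognize_or_fallback, fake_ok and fake_fail are made-up names for the demonstration and are not part of the speech_recognition library.

```python
FALLBACK = "---"  # the sentinel that takeCommand() returns on failure


def recognize_or_fallback(recognize, audio):
    """Call recognize(audio); return FALLBACK if it raises any exception."""
    try:
        return recognize(audio)
    except Exception as error:
        print(f"Recognition failed: {error!r}")
        return FALLBACK


def fake_ok(audio):
    """Stand-in for a recognizer call that succeeds."""
    return "what is python"


def fake_fail(audio):
    """Stand-in for a recognizer call that cannot understand the audio."""
    raise ValueError("unintelligible audio")


print(recognize_or_fallback(fake_ok, b"audio bytes"))
print(recognize_or_fallback(fake_fail, b"audio bytes"))
```

Because the function always returns a string, the caller can safely call lower() on the result, as the main loop does.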

5. speak Function: This function is used to play the response generated by the OpenAI API. It performs the following steps:

  • Split the input text into chunks
  • For each chunk, use the "os" library to run a shell command that converts the text to an audio file using the edge-tts library. The command used is:
  • command1 = f'edge-tts --voice "{voice}" --text "{text}" --write-media "data.mp3"'

Use the pygame library to play the audio file "data.mp3". The following steps are performed:

  • Initialize pygame and its mixer module
  • Load the audio file using the music.load() method from the pygame mixer module
  • Play the audio using the music.play() method from the pygame mixer module
  • Check if the audio is still playing using the get_busy() method from the pygame mixer module. If it is, call tick(10) from the pygame time module, which limits the loop to about 10 checks per second (a wait of up to roughly 100 milliseconds per pass).
  • If any exception occurs, print the exception.
  • Finally, stop the audio using the music.stop() method from the pygame mixer module and shut down the mixer using its quit() method.

6. Main Loop: The code runs in a loop. It performs the following steps:

  • Call the takeCommand() function to get the text representation of the user's voice
  • Convert the text to lowercase using the lower() method
  • Print the text
  • Send the text to the OpenAI API using the Completion.create() method from the openai library. The following parameters are passed:
      • model: The text-davinci-003 model is used
      • prompt: The user's text is passed as the prompt
      • temperature: The value of temperature is set to 0.9
      • max_tokens: The maximum number of tokens to generate is set to 150
      • top_p: The value of top_p is set to 1
      • frequency_penalty: The value of frequency_penalty is set to 0
      • presence_penalty: The value of presence_penalty is set to 0.6
      • stop: Generation stops at the sequences " Human:" and " AI:"
  • Take the text of the first choice in the response, print it, and pass it to the speak() function
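
The flow of the main loop can be sketched with stand-in functions so it runs without a microphone, the OpenAI API, or speakers. Everything here is illustrative: chat_loop and its parameters are made-up names, and skipping the "---" fallback is an optional addition that the original loop does not have (it sends "---" to the API as a prompt).

```python
def chat_loop(listen, complete, speak, max_turns=3, fallback="---"):
    """Run a few listen -> complete -> speak turns and return the replies."""
    replies = []
    for _ in range(max_turns):
        query = listen().lower()
        if query == fallback:
            # Nothing was recognised, so do not spend an API call on it.
            continue
        reply = complete(query).strip()
        speak(reply)
        replies.append(reply)
    return replies


# Stand-ins: scripted "speech", an echoing "model", and a list as "speaker".
heard = iter(["Hello", "---", "Bye"])
spoken = []
replies = chat_loop(
    listen=lambda: next(heard),
    complete=lambda q: f" echo: {q}\n",
    speak=spoken.append,
)
print(replies)
```

In the real program, listen would be takeCommand, complete would wrap the Completion.create() call, and speak would be the speak() function.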

# If you face any problem!

Please raise a query about the error you are facing, and our debugging community will try to assist you. You can also contribute to the community by helping others.
