Mirror of https://github.com/xtekky/gpt4free.git, synced 2025-12-06 02:30:41 -08:00
feat: add audio generation support for multiple providers
- Added new examples for `client.media.generate` with `PollinationsAI`, `EdgeTTS`, and `Gemini` in `docs/media.md`
- Modified `PollinationsAI.py` to default to `default_audio_model` when audio data is present
- Adjusted `PollinationsAI.py` to conditionally construct the message list from `prompt` when media is being generated
- Rearranged `PollinationsAI.py` response handling to yield `save_response_media` after checking for non-JSON content types
- Added support in `EdgeTTS.py` for using default values for `language`, `locale`, and `format` from class attributes
- Improved voice selection logic in `EdgeTTS.py` to fall back to the default locale or language when none is explicitly provided
- Updated `EdgeTTS.py` to yield `AudioResponse` with the `text` field included
- Modified `Gemini.py` to support `.ogx` audio generation when `model == "gemini-audio"` or `audio` is passed
- Used `format_image_prompt` in `Gemini.py` to create the audio prompt, saving the audio file via `synthesize`
- Appended `AudioResponse` to `Gemini.py` for the audio generation flow
- Added a `save()` method to the `Image` class in `stubs.py` to support saving `/media/` files locally
- Changed `client/__init__.py` to fall back to `options["text"]` if `alt` is missing in `Images.create`
- Ensured `AudioResponse` in `copy_images.py` includes the `text` (prompt) field
- Added an `Annotated` fallback definition in `api/__init__.py` for compatibility with older Python versions
This commit is contained in:
parent
2f46008228
commit
b68b9ff6be
8 changed files with 90 additions and 34 deletions
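The last bullet above mentions an `Annotated` fallback in `api/__init__.py` for older Python versions (`typing.Annotated` only exists from Python 3.9). A minimal sketch of such a compatibility shim — hypothetical illustration, not the repository's actual code — could look like:

```python
try:
    from typing import Annotated
except ImportError:
    # Hypothetical fallback for Python < 3.9, where typing.Annotated is
    # unavailable: accept Annotated[T, ...] subscripting and simply
    # discard the metadata, returning the base type.
    class _AnnotatedMeta(type):
        def __getitem__(cls, item):
            return item[0] if isinstance(item, tuple) else item

    class Annotated(metaclass=_AnnotatedMeta):
        pass

# Either way, Annotated[...] subscripting is now valid on any interpreter.
Limit = Annotated[int, "maximum number of results"]
```

On modern interpreters the real `typing.Annotated` is used unchanged; the shim only matters on legacy runtimes.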
@@ -28,6 +28,30 @@ async def main():
asyncio.run(main())
```
#### **More examples for Generate Audio:**

```python
from g4f.client import Client
from g4f.Provider import EdgeTTS, Gemini, PollinationsAI

client = Client(provider=PollinationsAI)
response = client.media.generate("Hello", audio={"voice": "alloy", "format": "mp3"})
response.data[0].save("openai.mp3")

client = Client(provider=PollinationsAI)
response = client.media.generate("Hello", model="hypnosis-tracy")
response.data[0].save("hypnosis.mp3")

client = Client(provider=Gemini)
response = client.media.generate("Hello", model="gemini-audio")
response.data[0].save("gemini.ogx")

client = Client(provider=EdgeTTS)
response = client.media.generate("Hello", audio={"locale": "en-US"})
response.data[0].save("edge-tts.mp3")
```
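Each example ends with `response.data[0].save(...)`, the `save()` method this commit adds to the `Image` stub. As a rough illustration of what such a helper does — hypothetical code, assuming the stub holds either a local `/media/` path or a remote URL — it could be sketched like this:

```python
import os
import shutil
import urllib.request

class Image:
    """Minimal stand-in for the response stub that gains a save() method."""

    def __init__(self, url: str):
        self.url = url

    def save(self, path: str) -> None:
        # Copy the file when it already exists locally (e.g. a /media/ path),
        # otherwise fetch it over HTTP.
        if os.path.exists(self.url):
            shutil.copyfile(self.url, path)
        else:
            urllib.request.urlretrieve(self.url, path)
```

The real stub in `stubs.py` may differ in details; the point is that media responses can now be persisted to disk in one call.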
#### **Transcribe an Audio File:**

Some providers in G4F support audio inputs in chat completions, allowing you to transcribe audio files by instructing the model accordingly. This example demonstrates how to use the `AsyncClient` to transcribe an audio file asynchronously:
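The transcription example itself is cut off in this diff view. As a hedged sketch of what such an `AsyncClient` call might look like — the `transcribe` helper, the `media=[[file, name]]` argument, and the file name `audio.wav` are assumptions for illustration, not the documented API:

```python
import asyncio
import os

async def transcribe(path: str) -> str:
    # Imports deferred so the sketch can be read without g4f installed.
    from g4f.client import AsyncClient
    from g4f.Provider import PollinationsAI

    client = AsyncClient(provider=PollinationsAI)
    with open(path, "rb") as audio_file:
        # Pass the audio file as chat media and ask the model to transcribe it.
        response = await client.chat.completions.create(
            messages=[{"role": "user", "content": "Transcribe this audio"}],
            media=[[audio_file, path]],
        )
    return response.choices[0].message.content

if __name__ == "__main__" and os.path.exists("audio.wav"):
    print(asyncio.run(transcribe("audio.wav")))
```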