feat: add audio speech generation endpoint and media handling refactor

mirror of https://github.com/xtekky/gpt4free.git synced 2025-12-05 18:20:35 -08:00

- Added new `/v1/audio/speech` and `/api/{path_provider}/audio/speech` endpoints in `g4f/api/__init__.py` for generating speech from text
- Introduced `AudioSpeechConfig` model in `g4f/api/stubs.py` with fields for input, model, provider, voice, instructions, and response format
- Updated `PollinationsAI.py` to support `modalities` in `kwargs` when checking for audio
- Set default voice for audio models in `PollinationsAI.py` if not provided in `kwargs`
- Added debug print in `PollinationsAI.py` to log the request data sent to the text API endpoint
- Extended supported FastAPI response types in `g4f/api/__init__.py` to include `FileResponse` from `starlette.responses`
- Added `BackgroundTask` to clean up generated audio files after serving in `g4f/api/__init__.py`
- Modified `AnyProvider.py` to include `EdgeTTS`, `gTTS`, and `MarkItDown` as audio providers when `audio` is in `kwargs` or `modalities`
- Created `resolve_media` helper in `g4f/client/__init__.py` to standardize media handling for audio/image input
- Replaced manual media preprocessing in `Completions`, `AsyncCompletions`, and `Images` classes with `resolve_media`
- Added `/docs/README.md` with a link to the documentation site
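
As a rough illustration of how a client might call the new `/v1/audio/speech` endpoint, the sketch below builds (but does not send) a JSON POST request using only the standard library. The JSON keys are assumed to mirror the `AudioSpeechConfig` fields listed above (`response_format` in particular is a guess at the field's spelling), and `build_speech_request`, the base URL, and the port are hypothetical, not taken from this commit.

```python
import json
import urllib.request

# Assumption: address of a locally running g4f API server; adjust as needed.
BASE_URL = "http://localhost:1337"


def build_speech_request(text, model=None, provider=None, voice=None,
                         instructions=None, response_format=None,
                         base_url=BASE_URL):
    """Build a POST request for the speech endpoint without sending it.

    Hypothetical helper: the key names below are assumed to match the
    AudioSpeechConfig fields described in the commit message.
    """
    payload = {"input": text}
    optional = {
        "model": model,
        "provider": provider,
        "voice": voice,
        "instructions": instructions,
        "response_format": response_format,  # assumed key name
    }
    # Only include the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` and writing the response body to a file would then save the generated audio, assuming the server responds with the raw file as the `FileResponse` change suggests.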

hlohaus

2025-04-26 12:21:49 +02:00

parent b15a83ae13

commit c3632984f7

6 changed files with 90 additions and 29 deletions

1

docs/README.md Normal file


`@@ -0,0 +1 @@`
`+[Documentation](https://gpt4free.github.io/docs/main.html)`
