Victor Mustar PRO
AI & ML interests
Building the UX of this website
Recent Activity
updated a bucket 1 day ago: victor/ace-step-community
liked a model 2 days ago: nvidia/Lyra-2.0
updated a Space 2 days ago: victor/3d-horse
replied to their post 4 days ago
reacted to asigalov61's post with 🔥 4 days ago
Post
3941
🔥 Check out my new large-scale MIDI + Lyrics dataset!
asigalov61/Lyrics-MIDI-Dataset
~179k MIDIs with corresponding lyrics to play with!
If you liked the dataset, please leave a ❤️
Any feedback and/or suggestions are also appreciated.
reacted to SeaWolf-AI's post with 🔥 5 days ago
Post
5858
🧬 Darwin-27B-Opus: 86.9% on GPQA Diamond, World #5, Zero Training
We are excited to share Darwin-27B-Opus, a 27B model that achieved 86.9% on GPQA Diamond (ranking #5 globally on the Hugging Face leaderboard) without a single gradient update.
How? Darwin breeds pretrained models through evolutionary FFN crossbreeding. The father (Qwen3.5-27B) provides the reasoning architecture; the mother (Claude 4.6 Opus Reasoning Distilled) contributes structured chain-of-thought knowledge. CMA-ES automatically discovers optimal per-layer blending ratios; no human tuning required.
The result surpasses the original Qwen3.5-27B (85.5%), GLM-5.1 (744B, 86.2%), and Qwen3.5-122B (86.6%). A 27B model outperforming 744B, with zero training, zero data, one GPU, and ~2 hours.
We also confirmed hybrid vigor on Korean benchmarks: Darwin-27B-KR (a 2nd-generation offspring) surpassed both parents on CLIcK, winning 7 of 11 categories. The evolutionary optimizer independently assigned 93% of the FFN from the Korean-specialized mother while preserving 93% of the attention from the reasoning-specialized father, autonomously validating our core principle: FFN carries knowledge, attention carries reasoning.
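The per-layer blending idea is easy to sketch. Below is a toy (1+1) evolution strategy over per-layer interpolation weights; the post says the real optimizer is CMA-ES, and every name, dimension, and objective here is a placeholder for illustration, not the actual method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two parents' per-layer FFN weights (4 layers, tiny dims).
father = [rng.normal(size=(8, 8)) for _ in range(4)]
mother = [rng.normal(size=(8, 8)) for _ in range(4)]

def blend(alphas):
    """Per-layer interpolation: child_l = a_l * father_l + (1 - a_l) * mother_l."""
    return [a * f + (1 - a) * m for a, f, m in zip(alphas, father, mother)]

def fitness(alphas):
    # Placeholder objective standing in for a benchmark score (e.g. GPQA accuracy):
    # distance to a blend we pretend is optimal.
    target = blend([0.9, 0.1, 0.7, 0.3])
    return -sum(np.linalg.norm(c - t) for c, t in zip(blend(alphas), target))

# Simple (1+1)-ES: mutate the blending ratios, keep the candidate if it scores better.
# Real work would use CMA-ES (e.g. the `cma` package), which also adapts the covariance.
alphas, sigma = np.full(4, 0.5), 0.2
for _ in range(200):
    cand = np.clip(alphas + rng.normal(scale=sigma, size=4), 0.0, 1.0)
    if fitness(cand) > fitness(alphas):
        alphas = cand
```

The search only ever sees a scalar score per candidate, which is why no gradients, and hence no training, are needed.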
Public release: 10 days, 300+ community derivatives, 120K+ downloads.
Links:
Darwin-27B-Opus: FINAL-Bench/Darwin-27B-Opus
Article: https://huggingface.co/blog/FINAL-Bench/darwin-gpqa
Darwin Family Collection: https://huggingface.co/collections/FINAL-Bench/darwin-family
If foundation models are raw ore, Darwin is the forge. We are just getting started. 🔥
reacted to prithivMLmods's post with ❤️ 6 days ago
Post
6115
A new comparator on Spaces showcases Standard FLUX.2 Decoder vs. FLUX.2 Small Decoder. The Small Decoder is ~1.4× faster, uses ~1.4× less VRAM, and maintains near-identical image quality. It has ~28M parameters with narrower channels ([96, 192, 384, 384] vs. [128, 256, 512, 512]), and the demo supports sequence generation by running both decoders simultaneously and comparing the results side by side.
Comparator: prithivMLmods/Flux.2-4B-Decoder-Comparator
FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.
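As a rough sanity check on those numbers (illustrative only; the real decoder's layer layout isn't given here), the narrower channel widths alone account for savings in that ballpark:

```python
# Channel widths per stage, from the post above.
small = [96, 192, 384, 384]
standard = [128, 256, 512, 512]

# Activation memory grows roughly linearly with channel width per stage...
act_ratios = [s / m for s, m in zip(standard, small)]  # each = 4/3 ≈ 1.33

# ...while conv weight counts grow with c_in * c_out between adjacent stages.
weight_ratios = [
    (standard[i] * standard[i + 1]) / (small[i] * small[i + 1])
    for i in range(len(small) - 1)
]  # each = 16/9 ≈ 1.78
```

So a ~1.33× reduction in width is consistent with the reported ~1.4× VRAM and speed figures, with weight storage shrinking even more.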
posted an update 6 days ago
Post
4668
Want to share my enthusiasm for zai-org/GLM-5.1 here too 🔥
I think we have it: our open-source Claude Code = GLM-5.1 + Pi (https://pi.dev/). I built a Three.js racing game to eval it, and it's extremely impressive. Thoughts:
- One-shot car physics with real drift mechanics (this is hard)
- My favorite part: it's awesome at self-iterating (with no vision!). It created 20+ Bun.WebView debugging tools to drive the car programmatically and read game state, and proved a winding bug with vector math without ever seeing the screen.
- A 531-line racing AI in a single write: 4 personalities, a curvature map, racing lines, tactical drifting. It built telemetry tools to compare player vs. AI speed curves and data-tuned the parameters.
- All assets from scratch: 3D models, procedural textures, a sky shader, engine sounds, spatial AI audio!
- It can do hard math: it proved road normals pointed DOWN via vector cross products, and computed track curvature normalized by arc length to tune AI cornering speed.
You are going to hear about this model a lot in the coming months. Open source, let's go, and thanks z-ai!
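Those two "hard math" checks are straightforward to reproduce. A minimal sketch (not the model's actual code): a triangle normal via the cross product, where winding order decides whether it points up or down, and discrete curvature as the change in unit tangent per unit arc length:

```python
import numpy as np

def surface_normal(p0, p1, p2):
    """Triangle normal via the cross product; negative z means it points down."""
    p0 = np.asarray(p0, dtype=float)
    return np.cross(np.asarray(p1) - p0, np.asarray(p2) - p0)

def curvature(points):
    """Discrete curvature |dT/ds| along a polyline of shape (n, 2) or (n, 3)."""
    d = np.diff(points, axis=0)
    ds = np.linalg.norm(d, axis=1)
    tangents = d / ds[:, None]                       # unit tangent per segment
    dt = np.linalg.norm(np.diff(tangents, axis=0), axis=1)
    seg = 0.5 * (ds[:-1] + ds[1:])                   # arc length between tangents
    return dt / seg
```

For a circle of radius r, this recovers curvature ≈ 1/r, which is exactly the kind of invariant you can assert from telemetry without ever seeing the screen.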
reacted to Juanxi's post with 🔥 8 days ago
Post
4382
Awesome Multimodal Modeling
We introduce Awesome Multimodal Modeling, a curated repository tracing the architectural evolution of multimodal intelligence, from foundational fusion to native omni-models.
Taxonomy & Evolution:
Traditional Multimodal Learning: foundational work on representation, fusion, and alignment.
Multimodal LLMs (MLLMs): architectures connecting vision encoders to LLMs for understanding.
Unified Multimodal Models (UMMs): models unifying understanding + generation via diffusion, autoregressive, or hybrid paradigms.
Native Multimodal Models (NMMs): models trained from scratch on all modalities; contrasts early vs. late fusion under scaling laws.
Key Distinction:
UMMs unify tasks via generation heads; NMMs enforce interleaving through joint pre-training.
Explore & Contribute: https://github.com/OpenEnvision/Awesome-Multimodal-Modeling
reacted to qgallouedec's post with 🔥 18 days ago
Post
2309
TRL v1.0 is out!
Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.
The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.
What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.
pip install --upgrade trl
Read more: hf.co/blog/trl-v1
reacted to MikeDoes's post 19 days ago
Post
614
Ai4Privacy has been working on this for the past year.
Today we're releasing the PII Masking 2M Series, the world's largest open-source privacy-masking dataset. (Again.)
- 2M+ synthetic examples
- 32 locales across Europe
- 98 entity types
- 5 industry verticals: Health, Finance, Digital, Work, Location
- 1M+ entries freely available on Hugging Face
Every example is 100% synthetic. No real personal data. Built so you can train and evaluate PII detection models without the legal headaches.
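As a toy illustration of the task this dataset targets (the regexes and label names below are made up for the example; the real dataset defines 98 entity types with locale-aware formats that simple regexes cannot cover), PII masking reduces to detecting spans and substituting bracketed entity labels:

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d ()-]{7,}\d"),
}

def mask_pii(text: str) -> str:
    """Replace each detected span with its bracketed entity label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

A model trained on the dataset would replace these regexes with learned sequence labeling, but the input/output contract is the same.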
Thank you for 15,000,000+ downloads across our datasets, models, and libraries. This one's for you. ❤️
#privacy #ai #opensource #nlp #gdpr #pii #huggingface #machinelearning
reacted to unmodeled-tyler's post 23 days ago
Post
1971
Hey Hugging Face!
PRODUCT HUNT LINK: https://www.producthunt.com/products/quanta-intellect?utm_source=other&utm_medium=social
I've been sharing my new AI browser Vessel over the last few days, and I've gotten some great feedback and interest from a lot of you!
I'm excited to announce that Vessel Browser is now live on Product Hunt! If this is the first you've heard of it, check it out! Vessel is an open-source AI browser built specifically for agents on Linux. It's not a fork of an existing browser, and it doesn't assume that the human is the primary operator.
If you've already tried Vessel Browser, feel free to leave a review on Product Hunt with what you thought. Or, if you'd prefer, send me an email directly or reach out on Twitter if you have any questions. I'm perpetually online and happy to chat.
I'm committed to building the best open source AI browser out there, and Vessel is only going to improve as time goes on!
reacted to prabhatkr's post 23 days ago
Post
1297
Is Vector RAG Dead? Why We Built FastMemory to Beat PageIndex
If you've built a RAG pipeline for complex financial documents, you already know the painful truth: Standard vector search fails when things get complicated.
While tools like PageIndex and Mafin 2.5 provide great out-of-the-box PDF chat experiences, they hit structural bottlenecks the second you push them past basic queries.
We just published a comprehensive benchmark study comparing FastMemory against PageIndex across 5 advanced datasets. The results fundamentally change how we should think about document ingestion.
Read more: https://x.com/FastBuilderAI/status/2037404008978018493
reacted to prometechinc's post 23 days ago
Post
2293
Cicikuş v4-5B (POFUDUK Edition) is a next-generation compact language model engineered for high-efficiency reasoning, adaptive intelligence, and behavioral coherence. Built on the Gemma 4B IT foundation and enhanced through advanced LoRA optimization and selective layer reconstruction, this model delivers powerful performance without the overhead of massive parameter counts.
Explore the model: pthinc/pofuduk_cicikus_v4_5B
Why Cicikuş?
In a world dominated by massive LLMs, Cicikuş takes a different path:
- Fast & Efficient: designed for edge deployment and low-resource environments
- High Reasoning Accuracy: strong results across MMLU, GSM8K, HumanEval, and more
- Behavior-Aware Intelligence: powered by the Behavioral Consciousness Engine (BCE)
- Low Hallucination Rate: ~3% with built-in ethical filtering
- Multilingual Capable: optimized for English and Turkish
reacted to prithivMLmods's post with 🔥 23 days ago
Post
5296
The Flux-Klein-KV-Edit-Consistency demo is now available on Spaces. It preserves character identity and delivers high-quality, realistic results after edits. No special prompts needed: just upload the image, type your prompt, and get the resulting image blazing fast.
Demo Space: prithivMLmods/flux-klein-kv-edit-consistency
Model: black-forest-labs/FLUX.2-klein-9b-kv
Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
Gradio Server Mode: https://www.gradio.app/main/guides/server-mode
- Built with Headless Gradio, an alternative to using gr.Blocks for creating the frontend and triggering events, powered by FastAPI + Gradio. You can now design the frontend however you want, with continued support for APIs, MCP, and ZeroGPU.
- Gradio Server Mode is available from gradio@v6.10.0.
To learn more, visit the app page or the respective model pages.
reacted to mayafree's post about 2 months ago
Post
4392
I built a Space that lets you switch between all three Qwen3.5 official collection models in a single interface.
MAYA-AI/QWEN-3_5-CHAT
The architecture is the key part. Instead of using Gradio as the UI, I use it purely as an API engine. FastAPI serves a fully custom HTML/JS frontend that calls /gradio_api/call/chat via SSE streaming. No DOM conflicts, no layout constraints.
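For readers unfamiliar with that endpoint: Gradio's documented curl-style API is two-step (a POST to `/gradio_api/call/<endpoint>` returns an `event_id`, then a GET on `/gradio_api/call/<endpoint>/<event_id>` streams SSE). A simplified sketch of parsing such a stream, assuming one `data:` line per event block (function name mine, not from the Space's code):

```python
import json

def parse_gradio_sse(lines):
    """Pair each 'data:' payload with the most recent 'event:' field."""
    out, event = [], None
    for line in lines:
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            out.append((event, json.loads(line.split(":", 1)[1])))
    return out
```

The custom HTML/JS frontend does the equivalent of this in the browser, which is what lets it stream tokens without any Gradio UI components.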
Four main features: instant model switching with automatic spec adjustment (max tokens, temperature ceiling, Vision availability all update per model), Thinking Mode via /think prefix with collapsible reasoning chain, Vision image upload via base64 conversion, and HF OAuth implemented directly at the FastAPI level.
For model selection: 122B-A10B with Thinking Mode for math, logic, and agents. 27B for writing, translation, and instruction following. 35B-A3B for fast everyday questions.
A few surprises during development: Gradio 6.x quietly removed several parameters, base64 image strings broke gr.Image(type="pil") so I switched to gr.Textbox with backend PIL conversion, and Thinking Mode parsing needed a full rewrite with indexOf instead of regex.
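That gr.Textbox workaround amounts to decoding a data URL on the backend. A sketch (function name hypothetical; Pillow assumed available in the Space):

```python
import base64

def data_url_to_bytes(data_url: str) -> bytes:
    """Strip a 'data:image/png;base64,' style prefix and decode the payload."""
    _, _, payload = data_url.partition("base64,")
    return base64.b64decode(payload)

# With Pillow installed, the chat handler would then do something like:
#   import io; from PIL import Image
#   image = Image.open(io.BytesIO(data_url_to_bytes(value)))
```

The frontend only ever sends a plain string, so there is nothing for gr.Image's own preprocessing to break on.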
Thanks to the Qwen team for making this possible. Try it out and let me know what you think.
#Qwen3 #Qwen35 #OpenSourceAI #HuggingFace #LLM #ThinkingAI #vidraft #MultimodalAI
reacted to SeaWolf-AI's post with 🔥 about 2 months ago
Post
5019
ALL Bench: Global AI Model Unified Leaderboard
FINAL-Bench/all-bench-leaderboard
If you've ever tried to compare GPT-5.2 and Claude Opus 4.6 side by side, you've probably hit the same wall: the official Hugging Face leaderboard only tracks open-source models, so the most widely used AI systems simply aren't there. ALL Bench fixes that by bringing closed-source models, open-weight models, and (uniquely) all four teams under South Korea's national sovereign AI program into a single leaderboard. Thirty-one frontier models, one consistent scoring scale.
Scoring works differently here too. Most leaderboards skip benchmarks a model hasn't submitted, which lets models game their ranking by withholding results. ALL Bench treats every missing entry as zero and divides by ten, so there's no advantage in hiding your weak spots.
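That scoring rule is simple to state precisely. A sketch (function and variable names mine, not ALL Bench's code): because the divisor is the full benchmark count rather than the number of submitted results, a withheld score can only lower the average, never raise it:

```python
def composite(scores: dict, benchmarks: list) -> float:
    """Missing benchmark entries count as zero; divide by the full count."""
    return sum(scores.get(b, 0.0) for b in benchmarks) / len(benchmarks)
```

For example, with ten benchmarks a model reporting only two scores of 90 gets a composite of 18, not 90, so cherry-picking strong benchmarks does not pay.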
The ten core benchmarks span reasoning (GPQA Diamond, AIME 2025, HLE, ARC-AGI-2), coding (SWE-bench Verified, LiveCodeBench), and instruction-following (IFEval, BFCL). The standout is FINAL Bench, the world's only benchmark measuring whether a model can catch and correct its own mistakes. It reached rank five in global dataset popularity on Hugging Face in February 2026 and has been covered by Seoul Shinmun, Asia Economy, IT Chosun, and Behind.
Nine interactive charts let you explore everything from composite score rankings and a full heatmap to an open-vs-closed scatter plot. Operational metrics like context window, output speed, and pricing are included alongside benchmark scores.
All data is sourced from Artificial Analysis Intelligence Index v4.0, arXiv technical reports, Chatbot Arena ELO ratings, and the Korean Ministry of Science and ICT's official evaluation results. Updates monthly.
reacted to MonsterMMORPG's post about 2 months ago
Post
2244
SECourses Ultimate Video and Image Upscaler Pro is now at V2.1, and massive improvements have arrived.
Check the screenshots below to see all the features.
20 February 2026 Update, V2.1
This is a pretty big update:
- We have completely changed the FlashVSR+ backend to a new repo, and I have significantly upgraded this repo.
- The new FlashVSR+ works amazingly well, and I think it is better than SeedVR2 for high-res video upscaling, like upscaling 720p to a higher resolution.
- The top menu navigation bar has been updated to a better version and view.
- The FlashVSR+ tab was remade, and all its features are now working.
- For lower VRAM, a button has been added that you can use if you get OOM.
- Read the updated UI to understand how to use it.
- FlashVSR+ can now upscale images very well too.
- The Image Based GAN upscalers tab was also improved, and some bugs were fixed.
- The Output & Comparison tab's Video Output was not working properly; this issue is now fixed.
- In the Output & Comparison tab, new multi-video and multi-image comparison sliders were added, which are super useful for quickly comparing multiple videos and images.
- Lots of various bug fixes.
The app is getting closer to perfect; please test it heavily and let me know about errors and feature requests.
This update was mostly about improving FlashVSR+, since it is a very fast and amazing video upscaler model.
Image Based GAN upscaling can now upscale videos perfectly fine, and Batch Size (Frames per Iteration) now works to speed up video upscaling.
To update, get the latest zip file, extract it, overwrite all files, and run the Windows_Run_SECourses_Upscaler_Pro.bat file.
reacted to Ujjwal-Tyagi's post with 🔥 2 months ago
reacted to krisbailey's post 2 months ago
Post
485
While doing various projects, I kept running into situations where I wanted representative samples of some of the current large SOTA datasets that were smaller, so I didn't need to worry about slicing or anything else at runtime. So I created sub-datasets, making sure to keep the same ratios of data sources. Each dataset card provides info on what's in it.
100M token datasets:
RedPajama v2 100M
Falcon RefinedWeb 100M
Cosmopedia 100M
1B token datasets:
Fineweb-edu 1B
RedPajama v1 1B
RedPajama v2 1B (use this one)
Cosmopedia 1B
10B token datasets:
RedPajama v1 10B
Cosmopedia 10B
Collection here:
https://huggingface.co/collections/krisbailey/bite-size-data
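The "same ratios of data sources" idea can be sketched as proportional stratified sampling. This is row-level for simplicity, where the actual datasets are sized in tokens, and all names here are illustrative:

```python
import random

def proportional_sample(rows, total, key=lambda r: r["source"], seed=0):
    """Draw `total` rows while preserving each source's share of the corpus."""
    rng = random.Random(seed)
    by_source = {}
    for row in rows:
        by_source.setdefault(key(row), []).append(row)
    n = len(rows)
    out = []
    for group in by_source.values():
        k = round(total * len(group) / n)   # this source's proportional quota
        out.extend(rng.sample(group, min(k, len(group))))
    return out
```

Because each source's quota is computed from its share of the full corpus, the sub-dataset keeps the mixing ratios of the original, which is what makes it representative at a fraction of the size.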