panda{·}etl (YC W24)
Automate your document workflows
Giuseppe Della Corte

Turn messy files into actionable data. Upload PDFs, images, audio and websites. Define data points for AI-powered extraction. See results in exportable spreadsheets with linked, highlighted sources. Ask questions, plot charts and draft reports on top.
Giuseppe Della Corte
👋 Hey Product Hunt community! We're excited to share something we've been working on at Sinaptik (YC W24). After creating pandas-ai (chat with your tables), we've had countless conversations with data analysts and business experts about their daily struggles. These chats revealed some common frustrations:
1. Valuable insights buried in messy, hard-to-read files
2. The headache of managing documents with different permission settings
3. RAG chatbots that seemed promising but ended up being costly data dumps
4. And the ever-present challenge of bridging the gap between business experts and developers
These weren't just abstract problems - we saw how they affected real people trying to do their jobs efficiently. So, we rolled up our sleeves and got to work. We talked to analysts, data scientists, and business users to understand their needs. The result is panda{·}etl, a tool we hope will make life easier for anyone dealing with document-heavy workflows. With panda{·}etl, you can:
1. Upload those tricky files (you know, the PDFs, images, and audio files that usually cause headaches)
2. Define exactly what data you need (whether it's ESG metrics, competitor data, market trends, risk engineering reports, or insurance claims)
3. Get spreadsheets where you can actually trace where each piece of data came from
4. Easily validate and export your data
5. And use our pandas-ai powered chat to explore your extractions, plot charts and add them to drafts
We've built panda{·}etl with flexibility in mind, offering solutions ranging from SaaS to On-Premise:
1. For individuals, we offer a free personal plan with a limited number of documents processed and extractions per month. It's perfect for trying out the tool and for smaller projects.
2. For businesses, we have scalable plans that grow with your needs, file sizes and document volumes.
3. For enterprises, we provide custom solutions, including on-premise deployments for those with specific security or compliance requirements.
We're still learning and improving, and that's why we're here. We'd love to hear your thoughts, experiences, or even your data horror stories. How do you deal with unstructured data in your work? What solutions have you tried? Let's chat - we're genuinely curious to learn from this community! 🙌
Alexandros Fokianos
@gdc I've been following you guys for a long time. As a friend, I'm so proud to see you evolving the product and testing things out. As a fellow startupper, I think you're on track to solve huge problems, not only for enterprise companies but also for startups and scaleups that manage large amounts of data. Great launch!
Gabriele Venturi
@fok96 thanks a lot mate!
Giuseppe Della Corte
@fok96 excited to have Zefi's team support. Let's rock!
Andrew Nelson
@gdc congrats on the launch, Giuseppe! 🙌 looks like a neat product, looking forward to trying it out 😊
Serge Tim
Congrats! Do you have an API? My use case: I want to build a QA system for my CSV tables with survey results (some cells may contain numbers, and some contain text), and it needs to work within my product
Gabriele Venturi
@s5f5f5f thanks a lot! Yes, we also offer an API that is a perfect match for your use case. Feel free to reach out, would love to learn more about your use case: gabriele@sinaptik.ai
Jose Quan
@gabriele_venturi I am also interested in an API, how can we get in touch?
Tony Han
I've tried a few RAG-enabled tools, but none of them seem to be effective. Will try this out - looks very promising. I like how it's open source, free (with a limit) and how you can create workflows to automate file processing. Would be cool to see how others are handling files, if they want to share the workflows they built! Congrats on the launch @gdc and team!
Gabriele Venturi
@tonyhanded thanks a lot, can’t wait to hear your thoughts 🚀
Daniel Bukač
I love the design and the idea is solid. How would you distinguish yourself from Deepnote's AI features?
Gabriele Venturi
@daniel_bukac thanks a lot for the feedback! We are going to focus more and more on a no-code UX, adding more and more pipelines from the community. The goal is that anyone, regardless of technical expertise, can run pipelines on panda{·}etl!
Şeyma Alan
Congrats! The product seems very successful. Are you able to extract information from tables in PDFs?
Gabriele Venturi
@seymaalan yes, and soon we are releasing v2 of our PDF parser, which will be even more accurate 😄
Jose Quan
I read in one of the comments that you offer an API… I run a copier dealership with 100s of copier scanners that produce 1000s of PDFs, and I'm pretty sure our clients (banks, insurance cos, BPOs, car dealerships, hospitals, etc.) would find it useful. How can we get in touch?
Giuseppe Della Corte
@jose_quan1 there is a form on our website where you can book a call.
Nicole Park
Congrats on the launch, @gabriele_venturi! I love that there's also an open-source version. It will be useful across a variety of fields in different industries. Wishing your team great success! I'll definitely give it a try. :)
Gabriele Venturi
@nicolepark thanks a lot for the kind feedback, we’ll do our best to keep going!
Jonathan Viet Pham
Congrats to the panda.etl team! This tool sounds like a fantastic way to simplify turning unstructured data into actionable insights. Looking forward to seeing how it helps streamline data extraction!
Gabriele Venturi
@vietpham thanks a lot! Can’t wait to hear your feedback if you have a chance to test it out!
Giustino
Congrats on the launch @gdc! I'm really looking forward to trying this. Working with several clients in the past years, I can see how much value they could get from more open access to data! 👏
Leonardo Ubbiali
Panda{·}etl looks useful for turning messy files into structured data. It handles various input types, uses AI for extraction, and provides results in spreadsheets with source links. The added analysis features could save time. However, its real-world performance and pricing would determine its actual value to users.
Gabriele Venturi
@leonardo_ubbiali thanks a lot for the great feedback, really appreciated! We'll keep pushing 🚀 Stay tuned!
Daniel Zhang
This is really interesting. I think your summary of the 4 main issues is exactly right for data scientists and analysts. Especially permission settings: it took us a week just to hand out the right credentials and permissions for different database access. I've checked out your pricing and I wasn't sure how far the 500 credits would get me. Is that in terms of number of files, or total file size? How much would you say that is? Again, congratulations Giuseppe!
Gabriele Venturi
@daniel_xpo the pricing is based on characters or pages, whichever is lower. The free plan includes at least 1000 pages per month. Thanks a lot for the great feedback. Looking forward to hearing more if you give it a try!
Kaushik Mukherjee
Congratulations on the launch. What is your accuracy as far as numbers go?
CHEN
Congrats on the launch! Turning messy files into actionable data sounds incredibly useful. How does the AI handle different file formats like PDFs and audio for data extraction? 🔍 @gdc
Filippo Guerranti
I'm really looking forward to using this amazing tool in my daily work. I'm expecting a huge increase in productivity and performance, especially when lots of messy data is involved! Thanks for the amazing work you are putting into your products!
Gabriele Venturi
@filippo_guerranti thanks a lot, can't wait to hear your feedback 🚀
Yanlin Wu
Congrats on the launch! I've been having a hard time extracting information from PDFs recently... But with panda{·}etl, I believe my problems can be solved!!! That's brilliant! Can't wait to try it out! Good luck, and I look forward to your future updates!
Glen Dsouza
Congrats on the launch! This looks really good! Do you plan to add cloud services too in the future?
Gabriele Venturi
@glen_dsouza_ yes, we already offer cloud in closed beta. Feel free to reach out if interested 😀
Francesco Manicardi
Nice product @gabriele_venturi, can you share a bit more about how it works? Does it extract text from the PDF or does it do OCR? If it extracts text strings, how do you deal with tables, which can turn out all messed up with newlines and weird formatting?
Gabriele Venturi
@francesco_manicardi great question. It does extract text and it does OCR. We have built a custom parser that, for each page, identifies the different components (images, text, tables, charts, etc.), parses each one individually with the most accurate technique, and formats the output so it is easier for LLMs to understand.
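For readers who want a concrete picture of the per-component approach described in this reply, here is a minimal sketch. It is not panda{·}etl's actual parser: every name in it (`Component`, `PARSERS`, `parse_page`) is hypothetical, the component detection step is assumed to have already happened, and the image handler is only a placeholder where an OCR step would go.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Component:
    """One detected region of a page (hypothetical structure)."""
    kind: str      # e.g. "text", "table", "image"
    content: Any   # raw payload for that region


def parse_text(c: Component) -> str:
    # Plain text only needs whitespace cleanup.
    return str(c.content).strip()


def parse_table(c: Component) -> str:
    # Render rows as a Markdown table so the row/column structure
    # survives instead of collapsing into loose newlines.
    rows = [[str(cell) for cell in row] for row in c.content]
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def parse_image(c: Component) -> str:
    # Placeholder: a real pipeline would run OCR or captioning here.
    return f"[image: {c.content}]"


# Each component kind is routed to its own parsing technique.
PARSERS: Dict[str, Callable[[Component], str]] = {
    "text": parse_text,
    "table": parse_table,
    "image": parse_image,
}


def parse_page(components: List[Component]) -> str:
    # Unknown kinds fall back to plain-text handling.
    parts = [PARSERS.get(c.kind, parse_text)(c) for c in components]
    return "\n\n".join(parts)


page = [
    Component("text", "  Revenue by year  "),
    Component("table", [["Year", "Revenue"], ["2023", "10"]]),
]
print(parse_page(page))
```

The point of the dispatch table is that a table never goes through the plain-text path, which is where the newline and formatting problems raised in the question usually come from.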
Leonardo Vezzati
This is amazing guys!
Tommaso Cardinale
This sounds like a great idea, I hope to see this project come to fruition soon. PDFs are useful but with panda{·}etl they could be powerful. Good luck guys!
Alan Turing
This looks really amazing! Looking forward to trying it out! Not only will it save me a lot of time and effort, but it will also provide insights that I probably wouldn't find myself!! Great job 👏
Gabriele Venturi
@alan_turing2 thank you so much Alan, really super appreciated! Looking forward to hearing your thoughts as you give it a try!