Wednesday, March 18, 2015

html5ever project update: one year!

I started the html5ever project just a bit over one year ago. We adopted the current name last July.

<kmc> maybe the whole project needs a better name, idk
<Ms2ger> htmlparser, perhaps
<jdm> tagsoup
<Ms2ger> UglySoup
<Ms2ger> Since BeautifulSoup is already taken
<jdm> html5ever
<Ms2ger> No
<jdm> you just hate good ideas
<pcwalton> kmc: if you don't call it html5ever that will be a massive missed opportunity

By that point we already had a few contributors. Now we have 469 commits from 18 people, which is just amazing. Thank you to everyone who helped with the project. Over the past year we've upgraded Rust almost 50 times; I'm extremely grateful to the community members who had a turn at this Sisyphean task.

Several people have also contributed major enhancements. For example:

  • Clark Gaebel implemented zero-copy parsing. I'm in the process of reviewing this code and will be landing pieces of it in the next few weeks.

  • Josh Matthews made it possible to suspend and resume parsing from the tree sink. Servo needs this to do async resource fetching for external <script>s of the old-school (non-async/defer) variety.

  • Chris Paris implemented fragment parsing and improved serialization. This means Servo can use html5ever not only for parsing whole documents, but also for the innerHTML/outerHTML getters and setters within the DOM.

  • Adam Roben brought us dramatically closer to spec conformance. Aside from foreign (XML) content and <template>, we pass 99.6% of the html5lib tokenizer and tree builder tests! Adam also improved the build and test infrastructure in a number of ways.

I'd also like to thank Simon Sapin for doing the initial review of my code, and finding a few bugs in the process.

html5ever makes heavy use of Rust's metaprogramming features. It's been something of a wild ride, and we've collaborated with the Rust team in a number of ways. Felix Klock came through in a big way when a Rust upgrade broke the entire tree builder. Lately, I've been working on improvements to Rust's macro system ahead of the 1.0 release, based in part on my experience with html5ever.

Even with the early-adopter pains, the use of metaprogramming was absolutely worth it. Most of the spec-conformance patches were only a few lines, because our encoding of parser rules is so close to what's written in the spec. This is especially valuable with a "living standard" like HTML.
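
To make the idea concrete, here is a toy sketch of rules written as a table that a macro expands into a `match`. This is not html5ever's actual macro code, which relies on procedural macros. The `transitions!` macro, the `State` enum, and the `step` function are names made up for illustration, and the three states are a heavily simplified slice of the spec's tokenizer.

```rust
/// A tiny, hypothetical subset of tokenizer states (illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Data,
    TagOpen,
    TagName,
}

/// Expands a table of "in STATE on PATTERN => NEXT;" rules into a `step`
/// function. Each rule becomes one match arm, in the order written.
macro_rules! transitions {
    ( $( in $from:ident on $pat:pat => $to:ident; )* ) => {
        pub fn step(state: State, c: char) -> State {
            match (state, c) {
                $( (State::$from, $pat) => State::$to, )*
            }
        }
    };
}

// The rule table reads roughly like prose: one line per transition.
transitions! {
    in Data    on '<'       => TagOpen;
    in Data    on _         => Data;
    in TagOpen on 'a'..='z' => TagName;
    in TagOpen on _         => Data;
    in TagName on '>'       => Data;
    in TagName on _         => TagName;
}
```

With a layout like this, a change to a rule in the spec tends to map to a one-line change in the table, and the compiler still checks that the generated `match` is exhaustive. Feeding a string through it is a fold: `"<p>".chars().fold(State::Data, step)`.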

The future

Two upcoming enhancements are a high priority for Web compatibility in Servo:

  • Character encoding detection and conversion. This will build on the zero-copy UTF-8 parsing mentioned above. Non-UTF-8 content (~15% of the Web) will have "one-copy parsing" after a conversion to UTF-8. This keeps the parser itself lean and mean.

  • document.write support. This API can insert arbitrary UTF-16 code units (which might not even be valid Unicode) in the middle of the UTF-8 stream. To handle this, we might switch to WTF-8. Along with document.write we'll start to do speculative parsing.
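
The document.write problem is easy to demonstrate with the standard library alone. The sketch below is not html5ever code, and the helper names `has_lone_surrogate` and `to_utf8_lossy` are hypothetical. It shows that an unpaired surrogate has no UTF-8 encoding, so a strictly UTF-8 parser must either reject it or replace it, which is the gap WTF-8 is meant to fill.

```rust
use std::char::decode_utf16;

/// Returns true if `units` contains an unpaired surrogate, meaning the
/// input is not well-formed UTF-16 and cannot be encoded as UTF-8.
/// (Hypothetical helper, for illustration.)
pub fn has_lone_surrogate(units: &[u16]) -> bool {
    decode_utf16(units.iter().copied()).any(|r| r.is_err())
}

/// Converts UTF-16 code units to a UTF-8 `String`, replacing each lone
/// surrogate with U+FFFD. This is lossy: the original code unit cannot
/// be recovered afterwards. (Hypothetical helper, for illustration.)
pub fn to_utf8_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}
```

For example, `has_lone_surrogate(&[0x0068, 0xD83D])` is true because 0xD83D is a high surrogate with no low surrogate after it, and `String::from_utf16` returns an error for the same input. The lossy conversion keeps the parser on valid UTF-8 but changes what the script wrote, which is why a representation that can carry lone surrogates is attractive here.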

It's likely that I'll work on one or both of these in the next quarter.

Servo may get SVG support in the near future, thanks to canvg. SVG nodes can be embedded in HTML or loaded from an external XML file. To support the first case, html5ever needs to implement WHATWG's rules for parsing foreign content in HTML. To handle external SVG we could use a proper XML parser, or we could extend html5ever to support "XML5", an error-tolerant XML syntax similar to WHATWG HTML. Ygg01 made some progress towards implementing XML5. Servo would most likely use it for XHTML as well.

Improved performance is always a goal. html5ever describes itself as "high-performance" but does not have specific comparisons to other HTML parsers. I'd like to fix that in the near future. Zero-copy parsing will be a substantial improvement, once some performance issues in Rust get fixed. I'd like to revisit SSE-accelerated parsing as well.

I'd also like to support html5ever on some stable Rust 1.x version, although it probably won't happen for 1.0.0. The main obstacle here is procedural macros. Erick Tryzelaar has done some great work recently with syntex, aster, and quasi. Switching to this ecosystem will get us close to 1.x compatibility and will clean up the macro code quite a bit. I'll be working with Erick to use html5ever as an early validation of his approach.

Simon has extracted Servo's CSS selector matching engine as a stand-alone library. Combined with html5ever this provides most of the groundwork for a full-featured HTML manipulation library.

The C API for html5ever still builds, thanks to continuous integration. But it's not complete or well-tested. With the removal of Rust's runtime, maintaining the C API does not restrict the kind of code we can write in other parts of the parser. All we need now is to complete the C API and write tests. This would be a great thing for a community member to work on. Then we can write bindings for every language under the sun and bring fast, correct, memory-safe HTML parsing to the masses :)
