Wednesday, October 29, 2014

A taste of Rust (yum) for C/C++ programmers

If, like me, you've been frustrated with the status quo in systems languages, this article will give you a taste of why Rust is so exciting. In a tiny amount of code, it shows a lot of ways that Rust really kicks ass compared to C and C++. It's not just safe and fast, it's a lot more convenient.

Web browsers do string interning to condense the strings that make up the Web, such as tag and attribute names, into small values that can be compared quickly. I recently added event logging support to Servo's string interner. This will allow us to record traces from real websites, which we can use to guide further optimizations.

Here are the events we can log:

#[deriving(Show)]
pub enum Event {
    Intern(u64),
    Insert(u64, String),
    Remove(u64),
}

Interned strings have a 64-bit ID, which is recorded in every event. The String we store for "insert" events is like C++'s std::string; it points to a buffer in the heap, and it owns that buffer.

This enum is a bit fancier than a C enum, but its representation in memory is no more complex than a C struct. There's a tag for the three alternatives, a 64-bit ID, and a few fields that make up the String. When we pass or return an Event by value, it's at worst a memcpy of a few dozen bytes. There's no implicit heap allocation, garbage collection, or anything like that. We didn't define a way to copy an event; this means the String buffer always has a unique owner who is responsible for freeing it.

The deriving(Show) attribute tells the compiler to auto-generate a text representation, so we can print an Event just as easily as a built-in type.

Next we declare a global vector of events, protected by a mutex:

lazy_static! {
    pub static ref LOG: Mutex<Vec<Event>>
        = Mutex::new(Vec::with_capacity(50_000));
}

lazy_static! will initialize both of them when LOG is first used. Like String, the Vec is a growable buffer. We won't turn on event logging in release builds, so it's fine to pre-allocate space for 50,000 events. (You can put underscores anywhere in a integer literal to improve readability.)

lazy_static!, Mutex, and Vec are all implemented in Rust using gnarly low-level code. But the amazing thing is that all three expose a safe interface. It's simply not possible to use the variable before it's initialized, or to read the value the Mutex protects without locking it, or to modify the vector while iterating over it.

The worst you can do is deadlock. And Rust considers that pretty bad, still, which is why it discourages global state. But it's clearly what we need here. Rust takes a pragmatic approach to safety. You can always write the unsafe keyword and then use the same pointer tricks you'd use in C. But you don't need to be quite so guarded when writing the other 95% of your code. I want a language that assumes I'm brilliant but distracted :)

Rust catches these mistakes at compile time, and produces the same code you'd see with equivalent constructs in C++. For a more in-depth comparison, see Ruud van Asseldonk's excellent series of articles about porting a spectral path tracer from C++ to Rust. The Rust code performs basically the same as Clang / GCC / MSVC on the same platform. Not surprising, because Rust uses LLVM and benefits from the same backend optimizations as Clang.

lazy_static! is not a built-in language feature; it's a macro provided by a third-party library. Since the library uses Cargo, I can include it in my project by adding

[dependencies.lazy_static]
git = "https://github.com/Kimundi/lazy-static.rs"

to Cargo.toml and then adding

#[phase(plugin)]
extern crate lazy_static;

to src/lib.rs. Cargo will automatically fetch and build all dependencies. Code reuse becomes no harder than in your favorite scripting language.

Finally, we define a function that pushes a new event onto the vector:

pub fn log(e: Event) {
    LOG.lock().push(e);
}

LOG.lock() produces an RAII handle that will automatically unlock the mutex when it falls out of scope. In C++ I always hesitate to use temporaries like this because if they're destroyed too soon, my program will segfault or worse. Rust has compile-time lifetime checking, so I can do things that would be reckless in C++.

If you scroll up you'll see a lot of prose and not a lot of code. That's because I got a huge amount of functionality for free. Here's the logging module again:

#[deriving(Show)]
pub enum Event {
    Intern(u64),
    Insert(u64, String),
    Remove(u64),
}

lazy_static! {
    pub static ref LOG: Mutex<Vec<Event>>
        = Mutex::new(Vec::with_capacity(50_000));
}

pub fn log(e: Event) {
    LOG.lock().push(e);
}

This goes in src/event.rs and we include it from src/lib.rs.

#[cfg(feature = "log-events")]
pub mod event;

The cfg attribute is how Rust does conditional compilation. Another project can specify

[dependencies.string_cache]
git = "https://github.com/servo/string-cache"
features = ["log-events"]

and add code to dump the log:

for e in string_cache::event::LOG.lock().iter() {
    println!("{}", e);
}

Any project which doesn't opt in to log-events will see zero impact from any of this.

If you'd like to learn Rust, the Guide is a good place to start. We're getting close to 1.0 and the important concepts have been stable for a while, but the details of syntax and libraries are still in flux. It's not too early to learn, but it might be too early to maintain a large library.

By the way, here are the events generated by interning the three strings foobarbaz foo blockquote:

Insert(0x7f1daa023090, foobarbaz)
Intern(0x7f1daa023090)
Intern(0x6f6f6631)
Intern(0xb00000002)

There are three different kinds of IDs, indicated by the least significant bits. The first is a pointer into a standard interning table, which is protected by a mutex. The other two are created without synchronization, which improves parallelism between parser threads.

In UTF-8, the string foo is smaller than a 64-bit pointer, so we store the characters directly. blockquote is too big for that, but it corresponds to a well-known HTML tag. 0xb is the index of blockquote in a static list of strings that are common on the Web. Static atoms can also be used in pattern matching, and LLVM's optimizations for C's switch statements will apply.

33 comments:

  1. Found your blog. Its really have good information on C++ programming. Really liked and appreciate it. Thank you for all the information.

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. This comment has been removed by the author.

    ReplyDelete
  7. This comment has been removed by the author.

    ReplyDelete
  8. This comment has been removed by the author.

    ReplyDelete
  9. This comment has been removed by the author.

    ReplyDelete
  10. This comment has been removed by the author.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. This comment has been removed by the author.

    ReplyDelete

  14. Very interesting. If you have paid the irs penalty abatement form , we can get them back for you. Head over to refundproject.com or call us at: 888-659-0588.

    ReplyDelete
  15. This comment has been removed by the author.

    ReplyDelete
  16. This comment has been removed by the author.

    ReplyDelete

  17. Very interesting. If you have paid the penalty abatement
    , we can get them back for you. Head over to refundproject.com or call us at: 888-659-0588.

    ReplyDelete

  18. Very interesting. If you have paid the penalty abatement
    , we can get them back for you. Head over to refundproject.com or call us at: 888-659-0588.

    ReplyDelete

  19. Very interesting. If you have paid the IRS Penalties, we can get them back for you. Head over to http://refundproject.com or call us at: 888-659-0588.

    ReplyDelete
  20. I must say you had done a tremendous job,I appreciate all your efforts.Thanks alot for your writings......Waiting for a new 1...Please visit our wonderful and valuable website-
    Packers And Movers Bangalore

    ReplyDelete
  21. Packers and Movers Chennai Give Safe and Reliable ***Household Shifting Services in Chennai with Reasonable ###Packers and Movers Price Quotation. We Provide Household Shifting, Office Relocation, ✔✔✔Local and Domestic Transportation Services, Affordable and Reliable Shifting Service Charges @
    Packers And Movers Chennai

    ReplyDelete
  22. Packers and Movers Indore - Call 09303355424, Local Household Shifting in Indore, Domestic Home Relocation from Indore. International Home Relocation from Indore, Office Shifting within Indore, Car and Bike Transportation from Indore, Safe Packing and Moving Services,
    Best Packers and Movers Indore
    Best Packers and Movers Gurgaon
    Best Packers and Movers Kolkata
    Best Packers and Movers Mumbai
    Best Packers and Movers Bhopal
    Best Packers and Movers Delhi
    Best Packers and Movers Jaipur
    Best Packers and Movers Raipur
    Best Packers and Movers Pune
    Best Packers and Movers Ahmedabad

    ReplyDelete
  23. Packers and Movers Hyderabad Give Certified and Verified Service Providers, Cheap and Best ###Office Relocation Charges, ***Home Shifting, ???Goods Insurance worth Rs. 10,000, Assurance for Local and Domestic House Shifting. Safe and Reliable Household Shifting Services in Hyderabad with Reasonable Packers and Movers Price Quotation @
    Packers And Movers Hyderabad

    ReplyDelete
  24. I do not even know how I ended up here, but I thought this post was great.I do not know who you are but certainly you’re going to a famous blogger if you are not already 😉 Cheers!
    VTU Notes

    ReplyDelete
  25. Hi Friends, I have sifted yesterday from Delhi to Mumbai and I reached my destination. Before sifting I was very worried about my households, how I will manage about those items that I started to search Packers and Movers on the internet. I saw so many there that I took a risk because I haven't any idea of Packers and Movers and called to Agarwal Packers and Movers. Their staff came and start doing there work. I can't believe After watching their work I was so impressed and glad too.

    Agarwal Packers Reviews
    Agarwal Packers Feedback
    Agarwal Packers Complaint

    ReplyDelete
  26. kajal-raghwani-biography

    very nice post..
    I like your blog..
    I read regularly

    ReplyDelete