Perusall

Perusall

Perusall is a social web annotation tool intended for use by students at schools and universities. It allows users to annotate the margins of a text in a virtual group setting that is similar to social media—with upvoting, emojis, chat functionality, and notification. It also includes automatic AI grading. == History == Perusall began as a research project at Harvard University. It later became an educational product for students and teachers. As of 2024, Perusall states more than 5 million students have used the tool at over 5,000 educational institutions in 112 countries." == Functionality == Perusall integrates with learning management systems such as Moodle, Canvas and Blackboard to aid with collaborative annotation. The tool supports annotation of a range of media including text, images, equations, videos, PDFs and snapshots of webpages.

Seq2seq

Seq2seq is a family of machine learning approaches used for natural language processing. Originally developed by Lê Viết Quốc, a Vietnamese computer scientist and a machine learning pioneer at Google Brain, this framework has become foundational in many modern AI systems. Applications include language translation, image captioning, conversational models, speech recognition, and text summarization. Seq2seq uses sequence transformation: it turns one sequence into another sequence. == History == One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode. seq2seq is an approach to machine translation (or more generally, sequence transduction) with roots in information theory, where communication is understood as an encode-transmit-decode process, and machine translation can be studied as a special case of communication. This viewpoint was elaborated, for example, in the noisy channel model of machine translation. In practice, seq2seq maps an input sequence into a real-numerical vector by using a neural network (the encoder), and then maps it back to an output sequence using another neural network (the decoder). The idea of encoder-decoder sequence transduction had been developed in the early 2010s. The papers most commonly cited as the originators that produced seq2seq are two papers from 2014. In the seq2seq as proposed by them, both the encoder and the decoder were LSTMs. This had the "bottleneck" problem, since the encoding vector has a fixed size, so for long input sequences, information would tend to be lost, as they are difficult to fit into the fixed-length encoding vector. The attention mechanism, proposed in 2014, resolved the bottleneck problem. They called their model RNNsearch, as it "emulates searching through a source sentence during decoding a translation". A problem with seq2seq models at this point was that recurrent neural networks are difficult to parallelize. The 2017 publication of Transformers resolved the problem by replacing the encoding RNN with self-attention Transformer blocks ("encoder blocks"), and the decoding RNN with cross-attention causally-masked Transformer blocks ("decoder blocks"). === Priority dispute === One of the papers cited as the originator for seq2seq is (Sutskever et al 2014), published at Google Brain while they were on Google's machine translation project. The research allowed Google to overhaul Google Translate into Google Neural Machine Translation in 2016. Tomáš Mikolov claims to have developed the idea (before joining Google Brain) of using a "neural language model on pairs of sentences... and then [generating] translation after seeing the first sentence"—which he equates with seq2seq machine translation, and to have mentioned the idea to Ilya Sutskever and Quoc Le (while at Google Brain), who failed to acknowledge him in their paper. Mikolov had worked on RNNLM (using RNN for language modelling) for his PhD thesis, and is more notable for developing word2vec. == Architecture == The main reference for this section is. === Encoder === The encoder is responsible for processing the input sequence and capturing its essential information, which is stored as the hidden state of the network and, in a model with attention mechanism, a context vector. The context vector is the weighted sum of the input hidden states and is generated for every time instance in the output sequences. === Decoder === The decoder takes the context vector and hidden states from the encoder and generates the final output sequence. The decoder operates in an autoregressive manner, producing one element of the output sequence at a time. At each step, it considers the previously generated elements, the context vector, and the input sequence information to make predictions for the next element in the output sequence. Specifically, in a model with attention mechanism, the context vector and the hidden state are concatenated together to form an attention hidden vector, which is used as an input for the decoder. The seq2seq method developed in the early 2010s uses two neural networks: an encoder network converts an input sentence into numerical vectors, and a decoder network converts those vectors to sentences in the target language. The Attention mechanism was grafted onto this structure in 2014 and is shown below. Later it was refined into the encoder-decoder Transformer architecture of 2017. === Training vs prediction === There is a subtle difference between training and prediction. During training time, both the input and the output sequences are known. During prediction time, only the input sequence is known, and the output sequence must be decoded by the network itself. Specifically, consider an input sequence x 1 : n {\displaystyle x_{1:n}} and output sequence y 1 : m {\displaystyle y_{1:m}} . The encoder would process the input x 1 : n {\displaystyle x_{1:n}} step by step. After that, the decoder would take the output from the encoder, as well as the as input, and produce a prediction y ^ 1 {\displaystyle {\hat {y}}_{1}} . Now, the question is: what should be input to the decoder in the next step? A standard method for training is "teacher forcing". In teacher forcing, no matter what is output by the decoder, the next input to the decoder is always the reference. That is, even if y ^ 1 ≠ y 1 {\displaystyle {\hat {y}}_{1}\neq y_{1}} , the next input to the decoder is still y 1 {\displaystyle y_{1}} , and so on. During prediction time, the "teacher" y 1 : m {\displaystyle y_{1:m}} would be unavailable. Therefore, the input to the decoder must be y ^ 1 {\displaystyle {\hat {y}}_{1}} , then y ^ 2 {\displaystyle {\hat {y}}_{2}} , and so on. It is found that if a model is trained purely by teacher forcing, its performance would degrade during prediction time, since generation based on the model's own output is different from generation based on the teacher's output. This is called exposure bias or a train/test distribution shift. A 2015 paper recommends that, during training, randomly switch between teacher forcing and no teacher forcing. === Attention for seq2seq === The attention mechanism is an enhancement introduced by Bahdanau et al. in 2014 to address limitations in the basic Seq2Seq architecture where a longer input sequence results in the hidden state output of the encoder becoming irrelevant for the decoder. It enables the model to selectively focus on different parts of the input sequence during the decoding process. At each decoder step, an alignment model calculates the attention score using the current decoder state and all of the attention hidden vectors as input. An alignment model is another neural network model that is trained jointly with the seq2seq model used to calculate how well an input, represented by the hidden state, matches with the previous output, represented by attention hidden state. A softmax function is then applied to the attention score to get the attention weight. In some models, the encoder states are directly fed into an activation function, removing the need for alignment model. An activation function receives one decoder state and one encoder state and returns a scalar value of their relevance. Consider the seq2seq language English-to-French translation task. To be concrete, let us consider the translation of "the zone of international control ", which should translate to "la zone de contrôle international ". Here, we use the special token as a control character to delimit the end of input for both the encoder and the decoder. An input sequence of text x 0 , x 1 , … {\displaystyle x_{0},x_{1},\dots } is processed by a neural network (which can be an LSTM, a Transformer encoder, or some other network) into a sequence of real-valued vectors h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , where h {\displaystyle h} stands for "hidden vector". After the encoder has finished processing, the decoder starts operating over the hidden vectors, to produce an output sequence y 0 , y 1 , … {\displaystyle y_{0},y_{1},\dots } , autoregressively. That is, it always takes as input both the hidden vectors produced by the encoder, and what the decoder itself has produced before, to produce the next output word: ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , "") → "la" ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la") → "la zone" ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la zone") → "la zone de" ... ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la zone de contrôle international") → "la zone de contrôle international " Here, we use the special token as a control character to delimit the start of input for the decoder. The decoding terminates as soon as "" appears in the decoder output. ==

Graphics address remapping table

The graphics address remapping table (GART), also known as the graphics aperture remapping table, or graphics translation table (GTT), is an I/O memory management unit (IOMMU) used by Accelerated Graphics Port (AGP) and PCI Express (PCIe) graphics cards. The GART allows the graphics card direct memory access (DMA) to the host system memory, through which buffers of textures, polygon meshes and other data are loaded. AMD later reused the same mechanism for I/O virtualization with other peripherals including disk controllers and network adapters. A GART is used as a means of data exchange between the main memory and video memory through which buffers (i.e. paging/swapping) of textures, polygon meshes and other data are loaded, but can also be used to expand the amount of video memory available for systems with only integrated or shared graphics (i.e. no discrete or inbuilt graphics processor), such as Intel HD Graphics processors. However, this type of memory (expansion) remapping has a caveat that affects the entire system: specifically, any GART, pre-allocated memory becomes pooled and cannot be utilised for any other purposes but graphics memory and display rendering. Since PCI Express, the GART is extended to the GTT (Graphics Translation Table), which act as a buffer or cache between system memory and graphics card, and in PCI Express, the GTT buffer size is changeable by the GPU driver. == Operating system support == === Windows === Support for AGP GART was added since Windows 95 OSR2. Later, support for GTT was added since Windows XP SP2 and Windows Vista. === Linux === Jeff Hartmann served as the primary maintainer of the Linux kernel's agpgart driver, which began as part of Brian Paul's Utah GLX accelerated Mesa 3D driver project. The developers primarily targeted Linux 2.4.x kernels, but made patches available against older 2.2.x kernels. Dave Jones heavily reworked agpgart for the Linux 2.6.x kernels, along with more contributions from Jeff Hartmann. === FreeBSD === In FreeBSD, the agpgart driver appeared in its 4.1 release. === Solaris === AGPgart support was introduced into Solaris Express Developer Edition as of its 7/05 release.

Pwnie Awards

The Pwnie Awards are an annual awards ceremony that recognizes both excellence and incompetence in the field of information security, described by SecurityWeek as an event that "recognizes excellence and mocks incompetence in cybersecurity." Winners are selected by a committee of security industry professionals from nominations collected from the information security community. Nominees are announced yearly at Summercon, and the awards themselves are presented at the Black Hat Security Conference. == Origins == The name Pwnie Award is based on the word "pwn", which is hacker slang meaning to "compromise" or "control" based on the previous usage of the word "own" (and it is pronounced similarly). The name "The Pwnie Awards," pronounced as "Pony," is meant to sound like the Tony Awards, an awards ceremony for Broadway theater in New York City. == History == The Pwnie Awards were founded in 2007 by Alexander Sotirov and Dino Dai Zovi following discussions regarding Dino's discovery of a cross-platform QuickTime vulnerability (CVE-2007-2175) and Alexander's discovery of an ANI file processing vulnerability (CVE-2007-0038) in Internet Explorer. == Winners == === 2024 === Most Epic Fail: Crowdstrike for 2024 CrowdStrike incident Best Mobile Bug: Operation Triangulation Lamest Vendor Response: Xiaomi for obstructing Pwn2Own researchers from using their services Best Cryptographic Attack: GoFetch Best Desktop Bug: forcing realtime WebAudio playback in Chrome (CVE-2023-5996) Best Song: Touch Some Grass by UwU Underground Best Privilege Escalation: Windows Streaming Service UAF (CVE-2024-30089) by Valentina Palmiotti (chompie) Best Remote Code Execution: Microsoft Message Queuing (MSMQ) Remote Code Execution Vulnerability (CVE-2024-30080) Most Epic Achievement: Discovery and reverse engineering of the XZ Utils backdoor Most Innovative Research: Let the Cache Cache and Let the WebAssembly Assemble: Knocking’ on Chrome’s Shell by Edouard Bochin, Tao Yan, and Bo Qu Most Underhyped Research: See No Eval: Runtime Dynamic Code Execution in Objective-C === 2023 === Best Desktop Bug: CountExposure! by RyeLv(@b2ahex) Best Cryptographic Attack: Video-based cryptanalysis: Extracting Cryptographic Keys from Video Footage of a Device’s Power LED by Ben Nassi, Etay Iluz, Or Cohen, Ofek Vayner, Dudi Nassi, Boris Zadov, Yuval Elovici Best Song: Clickin’ Most Innovative Research: Inside Apple’s Lightning: Jtagging the iPhone for Fuzzing and Profit Most Under-Hyped Research: Activation Context Cache Poisoning Best Privilege Escalation Bug: URB Excalibur: Slicing Through the Gordian Knot of VMware VM Escapes Best Remote Code Execution Bug: ClamAV RCE Lamest Vendor Response: Three Lessons From Threema: Analysis of a Secure Messenger Most Epic Fail: “Holy fucking bingle, we have the no fly list,” Epic Achievement: Clement Lecigne: 0-days hunter world champion Lifetime Achievement Award: Mudge === 2022 === Lamest Vendor Response: Google's "TAG" response team for "unilaterally shutting down a counterterrorism operation." Epic Achievement: Yuki Chen’s Windows Server-Side RCE Bugs Most Epic Fail: HackerOne Employee Caught Stealing Vulnerability Reports for Personal Gains Best Desktop Bug: Pietro Borrello, Andreas Kogler, Martin Schwarzl, Moritz Lipp, Daniel Gruss, Michael Schwarz for Architecturally Leaking Data from the Microarchitecture Most Innovative Research: Pietro Borrello, Martin Schwarzl, Moritz Lipp, Daniel Gruss, Michael Schwarz for Custom Processing Unit: Tracing and Patching Intel Atom Microcode Best Cryptographic Attack: Hertzbleed: Turning Power Side-Channel Attacks Into Remote Timing Attacks on x86 by Yingchen Wang, Riccardo Paccagnella, Elizabeth Tang He, Hovav Shacham, Christopher Fletcher, David Kohlbrenner Best Remote Code Execution Bug: KunlunLab for Windows RPC Runtime Remote Code Execution (CVE-2022-26809) Best Privilege Escalation Bug: Qidan He of Dawnslab, for Mystique in the House: The Droid Vulnerability Chain That Owns All Your Userspace Best Mobile Bug: FORCEDENTRY Most Under-Hyped Research: Yannay Livneh for Spoofing IP with IPIP Best Song: Dialed Up by Project Mammoth === 2021 === Lamest Vendor Response: Cellebrite, for their response to Moxie, the creator of Signal, reverse-engineering their UFED and accompanying software and reporting a discovered exploit. Epic Achievement: Ilfak Guilfanov, in honor of IDA's 30th Anniversary. Best Privilege Escalation Bug: Baron Samedit of Qualys, for the discovery of a 10-year-old exploit in sudo. Best Song: The Ransomware Song by Forrest Brazeal Best Server-Side Bug: Orange Tsai, for his Microsoft Exchange Server ProxyLogon attack surface discoveries. Best Cryptographic Attack: The NSA for its disclosure of a bug in the verification of signatures in Windows which breaks the certificate trust chain. Most Innovative Research: Enes Göktaş, Kaveh Razavi, Georgios Portokalidis, Herbert Bos, and Cristiano Giuffrida at VUSec for their research on the "BlindSide" Attack. Most Epic Fail: Microsoft, for their failure to fix PrintNightmare. Best Client-Side Bug: Gunnar Alendal's discovery of a buffer overflow on the Samsung Galaxy S20's secure chip. Most Under-Hyped Research: The Qualys Research Team for 21Nails, 21 vulnerabilities in Exim, the Internet's most popular mail server. === 2020 === Best Server-Side Bug: BraveStarr (CVE-2020-10188) – A Fedora 31 netkit telnetd remote exploit (Ronald Huizer') Best Privilege Escalation Bug: checkm8 – A permanent unpatchable USB bootrom exploit for a billion iOS devices. (axi0mX) Epic Achievement: "Remotely Rooting Modern Android Devices" (Guang Gong) Best Cryptographic Attack: Zerologon vulnerability (Tom Tervoort, CVE-2020-1472) Best Client-Side Bug: RCE on Samsung Phones via MMS (CVE-2020-8899 and -16747), a zero click remote execution attack. (Mateusz Jurczyk) Most Under-Hyped Research: Vulnerabilities in System Management Mode (SMM) and Trusted Execution Technology (TXT) (CVE-2019-0151 and -0152) (Gabriel Negreira Barbosa, Rodrigo Rubira Branco, Joe Cihula) Most Innovative Research: TRRespass: When Memory Vendors Tell You Their Chips Are Rowhammer-free, They Are Not. (Pietro Frigo, Emanuele Vannacci, Hasan Hassan, Victor van der Veen, Onur Mutlu, Cristiano Giuffrida, Herbert Bos, Kaveh Razavi) Most Epic Fail: Microsoft; for the implementation of Elliptic-curve signatures which allowed attackers to generate private pairs for public keys of any signer, allowing HTTPS and signed binary spoofing. (CVE-2020-0601) Best Song: Powertrace by Rebekka Aigner, Daniel Gruss, Manuel Weber, Moritz Lipp, Patrick Radkohl, Andreas Kogler, Maria Eichlseder, ElTonno, tunefish, Yuki and Kater Lamest Vendor Response: Daniel J. Bernstein (CVE-2005-1513) === 2019 === Best Server-Side Bug: Orange Tsai and Meh Chang, for their SSL VPN research. Most Innovative Research: Vectorized Emulation Brandon Falk Best Cryptographic Attack: \m/ Dr4g0nbl00d \m/ Mathy Vanhoef, Eyal Ronen Lamest Vendor Response: Bitfi Most Over-hyped Bug: Allegations of Supermicro hardware backdoors, Bloomberg Most Under-hyped Bug: Thrangrycat, (Jatin Kataria, Red Balloon Security) === 2018 === Most Innovative Research: Spectre/Meltdown (Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, Yuval Yarom) Best Privilege Escalation Bug: Spectre/Meltdown (Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, Yuval Yarom) Lifetime Achievement: Michał Zalewski Best Cryptographic Attack: ROBOT - Return Of Bleichenbacher’s Oracle Threat Hanno Böck, Juraj Somorovsky, Craig Young Lamest Vendor Response: Bitfi hardware crypto-wallet, after the "unhackable" device was hacked to extract the keys required to steal coins and rooted to play Doom. === 2017 === Epic Achievement: Federico Bento for Finally getting TIOCSTI ioctl attack fixed Most Innovative Research: ASLR on the line Ben Gras, Kaveh Razavi, Erik Bosman, Herbert Bos, Cristiano Giuffrida Best Privilege Escalation Bug: DRAMMER Victor van der Veen, Yanick Fratantonio, Martina Lindorfer, Daniel Gruss, Clementine Maurice, Giovanni Vigna, Herbert Bos, Kaveh Razavi, Cristiano Giuffrida Best Cryptographic Attack: The first collision for full SHA-1 Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov Lamest Vendor Response: Lennart Poettering - for mishandling security vulnerabilities most spectacularly for multiple critical Systemd bugs Best Song: Hello (From the Other Side) - Manuel Weber, Michael Schwarz, Daniel Gruss, Moritz Lipp, Rebekka Aigner === 2016 === Most Innovative Research: Dedup Est Machina: Memory Deduplication as an Advanced Exploitation Vector Erik Bosman, Kaveh Razavi, Herbert Bos, Cristiano Giuffrida Lifetime Achievement: Peiter Zatko aka Mudge Best Cryptographic Attack: DROWN attack Nimrod Aviram et al. Best Song: Cyberlier - Katie Mous

MetaMask

MetaMask is a software cryptocurrency wallet developed by ConsenSys for interacting with the Ethereum blockchain and other EVM-compatible networks. It enables users to manage Ethereum accounts and connect to decentralized applications (dApps) via a browser extension or mobile app. As of early 2026, MetaMask reports over 100 million users worldwide. == Overview == MetaMask allows users to store and manage private keys, send and receive Ethereum-based cryptocurrencies and tokens (including ERC-20 and ERC-721 standards), broadcast transactions, and interact with dApps. dApps connect to the wallet via JavaScript interfaces, prompting users to approve signatures or transactions. The wallet features MetaMask Swaps, an in-app token swap aggregator sourcing liquidity from multiple decentralized exchanges (DEXs), with a service fee of 0.875%. In 2025, MetaMask introduced the MetaMask Rewards program (initially mobile-only), where users earn points for activities such as swaps, bridging, and referrals. Season 1 (October 2025 – January 2026) distributed over $30 million in Linea tokens and other perks to participants. == History == MetaMask launched in 2016 as open-source software under the MIT license. It initially supported browser extensions for Chrome and Firefox. Mobile versions were in closed beta from 2019 and publicly released for iOS and Android in September 2020. In August 2020, the license changed to a custom proprietary one. MetaMask Swaps launched on desktop in October 2020 and on mobile in March 2021. The Rewards program launched in late 2025 with Linea integration. == Criticism == MetaMask has faced criticism over privacy, including default analytics settings that share some user data (which can be disabled). Its reliance on Infura (acquired by ConsenSys in 2019) has raised concerns about centralization in Ethereum infrastructure. The wallet regularly issues warnings about phishing scams and fake airdrops impersonating MetaMask.

Speech recognition

Speech recognition (automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT)) is a sub-field of computational linguistics concerned with methods and technologies that translate spoken language into text or other interpretable forms. Speech recognition applications include voice user interfaces, where the user speaks to a device, which "listens" and processes the audio. Common voice applications include interpreting commands for calling, call routing, home automation, and aircraft control. These applications are called direct voice input. Productivity applications include searching audio recordings, creating transcripts, and dictation. Speech recognition can be used to analyse speaker characteristics, such as identifying native language using pronunciation assessment. Voice recognition (speaker identification) refers to identifying the speaker, rather than speech contents. Recognizing the speaker can simplify the task of translating speech in systems trained on a specific person's voice. It can also be used to authenticate the speaker as part of a security process. == History == Applications for speech recognition developed over many decades, with progress accelerated due to advances in deep learning and the use of big data. These advances are reflected in an increase in academic papers, and greater system adoption. Key areas of growth include vocabulary size, more accurate recognition for unfamiliar speakers (speaker independence), and faster processing speed. === Pre-1970 === 1952 – Bell Labs researchers, Stephen Balashek, R. Biddulph, and K. H. Davis, built Audrey for single-speaker digit recognition. Their system located the formants in the power spectrum of each utterance. 1960 – Gunnar Fant developed and published the source–filter model of speech production. 1962 – IBM's 16-word "Shoebox" machine's speech recognition debuted at the 1962 World's Fair. 1966 – Linear predictive coding, a speech coding method, was proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone. 1969 – Funding at Bell Labs came to a halt for several years after the company's head engineer, John R. Pierce, wrote an open letter criticizing speech recognition research. This defunding lasted until Pierce retired and James L. Flanagan took over. Raj Reddy was the first person to work on continuous speech recognition, as a graduate student at Stanford University in the late 1960s. Previous systems required users to pause after each word. Reddy's system issued spoken commands for playing chess. Around this time, Soviet researchers invented the dynamic time warping (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary. DTW processed speech by dividing it into short frames (e.g. 10 ms segments) and treating each frame as a unit. Speaker independence, however, remained unsolved. === 1970–1990 === 1971 – DARPA funded a five-year speech recognition research project, Speech Understanding Research, seeking a minimum vocabulary size of 1,000 words. The project considered speech understanding a key to achieving progress in speech recognition, which was later disproved. BBN, IBM, Carnegie Mellon (CMU), and Stanford Research Institute participated. 1972 – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts. 1976 – The first ICASSP was held in Philadelphia, which became a major venue for publishing on speech recognition. During the late 1960s, Leonard Baum developed the mathematics of Markov chains at the Institute for Defense Analysis. A decade later, at CMU, Raj Reddy's students James Baker and Janet M. Baker began using the hidden Markov model (HMM) for speech recognition. James Baker had learned about HMMs while at the Institute for Defense Analysis. HMMs enabled researchers to combine sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model. By the mid-1980s, Fred Jelinek's team at IBM created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary. Jelinek's statistical approach placed less emphasis on emulating human brain processes in favor of statistical modelling. (Jelinek's group independently discovered the application of HMMs to speech.) This was controversial among linguists since HMMs are too simplistic to account for many features of human languages. However, the HMM proved to be a highly useful way for modelling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s. 1982 – Dragon Systems, founded by James and Janet M. Baker, was one of IBM's few competitors. === Practical speech recognition === The 1980s also saw the introduction of the n-gram language model. 1987 – The back-off model enabled language models to use multiple-length n-grams, and CSELT used HMM to recognize languages (in software and hardware, e.g. RIPAC). At the end of the DARPA program in 1976, the best computer available to researchers was the PDP-10 with 4 MB of RAM. It could take up to 100 minutes to decode 30 seconds of speech. Practical products included: 1984 – the Apricot Portable was released with up to 4096 words support, of which only 64 could be held in RAM at a time. 1987 – a recognizer from Kurzweil Applied Intelligence 1990 – Dragon Dictate, a consumer product released in 1990. AT&T deployed the Voice Recognition Call Processing service in 1992 to route telephone calls without a human operator. The technology was developed by Lawrence Rabiner and others at Bell Labs. By the early 1990s, the vocabulary of the typical commercial speech recognition system had exceeded the average human vocabulary. Reddy's former student, Xuedong Huang, developed the Sphinx-II system at CMU. Sphinx-II was the first to do speaker-independent, large vocabulary, continuous speech recognition, and it won DARPA's 1992 evaluation. Handling continuous speech with a large vocabulary was a major milestone. Huang later founded the speech recognition group at Microsoft in 1993. Reddy's student Kai-Fu Lee joined Apple, where, in 1992, he helped develop the Casper speech interface prototype. Lernout & Hauspie, a Belgium-based speech recognition company, acquired other companies, including Kurzweil Applied Intelligence in 1997 and Dragon Systems in 2000. L&H was used in Windows XP. L&H was an industry leader until an accounting scandal destroyed it in 2001. L&H speech technology was bought by ScanSoft, which became Nuance in 2005. Apple licensed Nuance software for its digital assistant Siri. ==== 2000s ==== In the 2000s, DARPA sponsored two speech recognition programs: Effective Affordable Reusable Speech-to-Text (EARS) in 2002, followed by Global Autonomous Language Exploitation (GALE) in 2005. Four teams participated in EARS: IBM; a team led by BBN with LIMSI and the University of Pittsburgh; Cambridge University; and a team composed of ICSI, SRI, and the University of Washington. EARS funded the collection of the Switchboard telephone speech corpus, which contained 260 hours of recorded conversations from over 500 speakers. The GALE program focused on Arabic and Mandarin broadcast news. Google's first effort at speech recognition came in 2007 after recruiting Nuance researchers. Its first product, GOOG-411, was a telephone-based directory service. Since at least 2006, the U.S. National Security Agency has employed keyword spotting, allowing analysts to index large volumes of recorded conversations and identify speech containing "interesting" keywords. Other government research programs focused on intelligence applications, such as DARPA's EARS program and IARPA's Babel program. In the early 2000s, speech recognition was dominated by hidden Markov models combined with feed-forward artificial neural networks (ANN). Later, speech recognition was taken over by long short-term memory (LSTM), a recurrent neural network (RNN) published by Sepp Hochreiter & Jürgen Schmidhuber in 1997. LSTM RNNs avoid the vanishing gradient problem and can learn "Very Deep Learning" tasks that require memories of events that happened thousands of discrete time steps earlier, which is important for speech. Around 2007, LSTMs trained with Connectionist Temporal Classification (CTC) began to outperform. In 2015, Google reported a 49 percent error‑rate reduction in its speech recognition via CTC‑trained LSTM. Transformers, a type of neural network based solely on attention, were adopted in computer vision and language modelling, and then to speech recognition. Deep feed-forward (non-recurrent) networks for acoustic modelling were introduced in 2009 by Geoffrey Hinton and his students at the University of Toronto, and by Li Deng and colleagues at Microsoft Research. In contrast to the prioer incremental improvements, deep learning decreased error rates by 30%. Both shallow and deep forms (e.g., recurrent nets) of ANNs had been explored since the 1980s. Howev

Random (software)

Random was an iOS mobile app that used algorithms and human-curation to create an adaptive interface to the Internet. The app served a remix of relevance and serendipity that allowed people to find diverse topics and interesting content that they might not have encountered otherwise. Random did not require a login or sign-up - the use of the app was anonymous. The app was powered by an artificial intelligence that learned from direct and indirect user interactions inside the app. While learning and adapting to a person, Random created a unique anonymous choice profile that was then used for recommending topics and content. The app didn't recommend the same content twice. == User interface == Random's user interface was made of ever-changing topic blocks that contained keywords and images. By choosing any of the blocks, the user would see related web content. By closing the web content, the user could access new related topics. The user interface allowed people to get more information about a specific topic area or then just leap freely from topic to topic. The content recommended by Random could be any type of web content, varying from news articles to long-form stories and from photographs to videos. Every user of the Random was curating content for other users by using the app. == History == Random was launched in March 2014. The startup was backed by Skype co-founder Janus Friis. The Random app received a strong reception from the likes of The New York Times, TechCrunch, New Scientist, Vice, and other leading publications. The app went on to gain traction with an active and loyal user community of several hundreds of thousands. This was not enough to support the free app model the team strongly believed in, and the service was terminated in December 2015. == Reception == Various reviews in media have emphasized that Random enables people to break their filter bubble and find diverse content they might not find elsewhere. Alan Henry of Lifehacker wrote: "Random... breaks you out by intentionally guiding you to new topics and interesting articles at sites you may not otherwise read." Vice Motherboard's Claire Evans says that: "Random never turns into a filter bubble, because it perpetually injects the irrational into my experience… in a cocktail of relevancy and serendipity." The app has been said to have a unique, minimalistic user experience. Kit Eaton of The New York Times commented that Random "let's you browse the news in a different way to all the other news sites you've probably ever used." Mashable reviewed Random by concluding that the "app may be one of the most simple content-discovery apps on the market."