Kurt Keutzer
#43,717
Most Influential Person Now
American computer scientist
Kurt Keutzer's AcademicInfluence.com Rankings
Kurt Keutzer's Computer Science Degrees
Computer Science
#1780
World Rank
#1845
Historical Rank
#848
USA Rank
Information Systems
#12
World Rank
#14
Historical Rank
#5
USA Rank
Computer Architecture
#22
World Rank
#22
Historical Rank
#17
USA Rank
Database
#1025
World Rank
#1079
Historical Rank
#298
USA Rank
Computer Science
Kurt Keutzer's Degrees
- PhD Electrical Engineering and Computer Science University of California, Berkeley
- Masters Electrical Engineering and Computer Science University of California, Berkeley
- Bachelors Electrical Engineering and Computer Science University of California, Berkeley
Why Is Kurt Keutzer Influential?
According to Wikipedia, Kurt Keutzer is an American computer scientist.
Early life and education
Kurt Keutzer grew up in Indianapolis, Indiana. He earned a bachelor's degree in mathematics from Maharishi University of Management in 1978, and a PhD in computer science from Indiana University in 1984.
Kurt Keutzer's Published Works
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size (2016) (5225)
- The Landscape of Parallel Computing Research: A View from Berkeley (2006) (2362)
- FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search (2018) (927)
- System-Level Design: Orthogonalization of Concerns and Platform-Based Design (2001) (794)
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size (2016) (672)
- A view of the parallel computing landscape (2009) (653)
- DenseNet: Implementing Efficient ConvNet Descriptor Pyramids (2014) (618)
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes (2019) (577)
- SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud (2017) (557)
- Estimation of average switching activity in combinational and sequential circuits (1992) (549)
- Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow (2010) (486)
- Parallel CAD: Algorithm Design and Programming Special Section Call for Papers TODAES: ACM Transactions on Design Automation of Electronic Systems (2010) (479)
- SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving (2016) (441)
- System-level design: orthogonalization of concerns and platform-based design (2000) (415)
- Fast support vector machine training and classification on graphics processors (2008) (410)
- Addressing the system-on-a-chip interconnect woes through communication-based design (2001) (405)
- SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud (2018) (379)
- ImageNet Training in Minutes (2017) (365)
- Getting to the bottom of deep submicron (1998) (337)
- Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT (2019) (337)
- DAGON: Technology binding and local optimization by DAG matching (1987) (328)
- A Survey of Quantization Methods for Efficient Neural Network Inference (2021) (304)
- FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters (2015) (284)
- Bus encoding to prevent crosstalk delay (2001) (282)
- Fast $\ell_1$-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime (2012) (278)
- HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision (2019) (271)
- Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions (2017) (270)
- DAGON: Technology Binding and Local Optimization by DAG Matching (1987) (258)
- Visual Transformers: Token-based Image Representation and Processing for Computer Vision (2020) (255)
- On average power dissipation and random pattern testability of CMOS combinational logic networks (1992) (233)
- A general probabilistic framework for worst case timing analysis (2002) (223)
- Copperhead: compiling an embedded data parallel language (2011) (219)
- Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data (2019) (216)
- Coverage Metrics for Functional Validation of Hardware Designs (2001) (210)
- ZeroQ: A Novel Zero Shot Quantization Framework (2020) (207)
- SqueezeNext: Hardware-Aware Neural Network Design (2018) (202)
- Storage assignment to decrease code size (1996) (189)
- Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search (2018) (184)
- A global wiring paradigm for deep submicron design (2000) (184)
- SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation (2020) (178)
- Impact of spatial intrachip gate length variability on the performance of high-speed digital circuits (2002) (175)
- Minimization of dynamic and static power through joint assignment of threshold voltages and sizing optimization [logic IC design] (2003) (170)
- Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers (2020) (169)
- How Much Can CLIP Benefit Vision-and-Language Tasks? (2021) (168)
- Estimation of power dissipation in CMOS combinational circuits using Boolean function manipulation (1992) (165)
- Practical parallel imaging compressed sensing MRI: Summary of two years of experience in accelerating body MRI of pediatric patients (2011) (161)
- OCCOM: efficient computation of observability-based code coverage metrics for functional verification (1998) (153)
- Robust delay-fault test generation and synthesis for testability under a standard scan design methodology (1991) (150)
- PyHessian: Neural Networks Through the Lens of the Hessian (2019) (144)
- From ASIC to ASIP: the next design discontinuity (2002) (141)
- Efficient, high-quality image contour detection (2009) (140)
- A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving (2018) (140)
- clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs (2012) (137)
- Impact of small process geometries on microarchitectures in systems on a chip (2001) (134)
- Algorithms for synthesis of hazard-free asynchronous circuits (1991) (132)
- Code density optimization for embedded DSP processors using data compression techniques (1995) (128)
- ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs (2019) (128)
- Hessian-based Analysis of Large Batch Training and Robustness to Adversaries (2018) (127)
- An observability-based code coverage metric for functional simulation (1996) (124)
- HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks (2019) (124)
- SEJITS: Getting Productivity and Performance With Selective Embedded JIT Specialization (2010) (120)
- Multi-source Domain Adaptation for Semantic Segmentation (2019) (118)
- How to scale distributed deep learning? (2016) (115)
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning (2020) (114)
- Towards true crosstalk noise analysis (1999) (111)
- Statistical timing analysis of combinational logic circuits (1993) (109)
- Multi-source Distilling Domain Adaptation (2019) (109)
- Closing the Gap Between ASIC and Custom - Tools and Techniques for High-Performance ASIC Design (2002) (109)
- Logic Synthesis (1994) (108)
- A design pattern language for engineering (parallel) software: merging the PLPP and OPL projects (2010) (107)
- Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization (2019) (107)
- Getting to the bottom of deep submicron II: a global wiring paradigm (1999) (105)
- A quick safari through the reconfiguration jungle (2001) (104)
- Communication-Avoiding QR Decomposition for GPUs (2011) (100)
- Computation of floating mode delay in combinational circuits: theory and algorithms (1993) (97)
- Impact of systematic spatial intra-chip gate length variability on performance of high-speed digital circuits (2000) (96)
- NP-Click: A Programming Model for the Intel IXP1200 (2004) (96)
- Instruction selection using binate covering for code size optimization (1995) (96)
- Synthesis of robust delay-fault-testable circuits: theory (1992) (94)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation (2020) (93)
- Miller factor for gate-level coupling delay calculation (2000) (93)
- Functional vector generation for HDL models using linear programming and 3-satisfiability (1998) (92)
- HAWQV3: Dyadic Neural Network Quantization (2020) (91)
- Efficient Parallelization of H.264 Decoding with Macro Block Level Scheduling (2007) (88)
- Estimation of average switching activity in combinational logic circuits using symbolic simulation (1997) (88)
- An automated exploration framework for FPGA-based soft multiprocessor systems (2005) (86)
- Is redundancy necessary to reduce delay? (1990) (85)
- Certified timing verification and the transition delay of a logic circuit (1992) (85)
- Code Optimization Techniques for Embedded DSP Microprocessors (1995) (85)
- Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs (2018) (84)
- I-BERT: Integer-only BERT Quantization (2021) (84)
- The Concurrency Challenge (2008) (83)
- Large-batch training for LSTM and beyond (2019) (79)
- A map reduce framework for programming graphics processors (2010) (79)
- SqueezeBERT: What can computer vision teach NLP about efficient neural networks? (2020) (79)
- Linear programming for sizing, Vth and Vdd assignment (2005) (79)
- Closing the power gap between ASIC and custom: an ASIC perspective (2000) (78)
- DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer (2015) (78)
- Limitations and challenges of computer-aided design technology for CMOS VLSI (2001) (75)
- A Partial Enhanced-Scan Approach to Robust Delay-Fault Test Generation for Sequential Circuits (1991) (74)
- Delay computation in combinational logic circuits: theory and algorithms (1991) (73)
- An FPGA-based soft multiprocessor system for IPv4 packet forwarding (2005) (72)
- Why is ATPG easy? (1999) (71)
- Building ASIPs: The Mescal Methodology (2006) (68)
- System-Level Performance Modeling with BACPAC - Berkeley Advanced Chip Performance Calculator (1999) (68)
- Integrated Model, Batch, and Domain Parallelism in Training Neural Networks (2017) (66)
- Region Similarity Representation Learning (2021) (66)
- EmotionGAN: Unsupervised Domain Adaptation for Learning Discrete Probability Distributions of Image Emotions (2018) (66)
- The Parallel Computing Laboratory at U.C. Berkeley: A Research Agenda Based on the Berkeley View (2008) (65)
- Affective Image Content Analysis: A Comprehensive Survey (2018) (64)
- Closing the gap between ASIC and custom: an ASIC perspective (2000) (63)
- Multi-source Domain Adaptation in the Deep Learning Era: A Systematic Survey (2020) (62)
- Code Generation and Optimization Techniques for Embedded Digital Signal Processors (1996) (58)
- A text-compression-based method for code size minimization in embedded systems (1999) (57)
- Improving cell libraries for synthesis (1994) (57)
- Synthesis of robust delay-fault-testable circuits: practice (1992) (55)
- Rethinking Deep-Submicron Circuit Design (1999) (54)
- Switching window computation for static timing analysis in presence of crosstalk noise (2000) (54)
- Our Pattern Language (OPL): A Design Pattern Language for Engineering (Parallel) Software (2009) (53)
- A functional validation technique: biased-random simulation guided by observability-based coverage (2001) (53)
- NP-Click: a productive software development approach for network processors (2004) (53)
- Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation (2021) (52)
- A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers (2012) (52)
- A fully data parallel WFST-based large vocabulary continuous speech recognition on a graphics processing unit (2009) (52)
- Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors (2008) (52)
- Self-Supervised Pretraining Improves Self-Supervised Pretraining (2021) (51)
- Closing the Power Gap Between ASIC & Custom (2007) (50)
- Counterexample-Guided Data Augmentation (2018) (50)
- Parallelizing CAD: A timely research agenda for EDA (2008) (49)
- Parallel scalability in speech recognition (2009) (49)
- A unified approach to the synthesis of fully testable sequential machines (1990) (48)
- Gate-delay-fault testability properties of multiplexor-based networks (1991) (48)
- Personalized Emotion Recognition by Personality-Aware High-Order Learning of Physiological Signals (2019) (48)
- CycleEmotionGAN: Emotional Semantic Consistency Preserved CycleGAN for Adapting Image Emotions (2019) (46)
- Shallow Networks for High-accuracy Road Object-detection (2016) (46)
- Computation of floating mode delay in combinational circuits: practice and implementation (1993) (46)
- Synthesis and optimization procedures for robustly delay-fault testable combinational logic circuits (1990) (45)
- Developing Architectural Platforms: A Disciplined Approach (2002) (45)
- PowerNorm: Rethinking Batch Normalization in Transformers (2020) (45)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation (2020) (45)
- A Novel Domain Adaptation Framework for Medical Image Segmentation (2018) (44)
- Closing the Power Gap between ASIC and Custom - Tools and Techniques for Low Power Design (2005) (43)
- Trust Region Based Adversarial Attack on Neural Networks (2018) (43)
- Comparing analytical modeling with simulation for network processors: a case study (2003) (43)
- On properties of algebraic transformation and the multifault testability of multilevel logic (1989) (43)
- Regret Minimization for Partially Observable Deep Reinforcement Learning (2017) (42)
- A Decomposition-based Constraint Optimization Approach for Statically Scheduling Task Graphs with Communication Delays to Multiprocessors (2007) (42)
- An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos (2020) (41)
- Validatable nonrobust delay-fault testable circuits via logic synthesis (1990) (41)
- Testability-preserving circuit transformations (1990) (40)
- The impact of CAD on the design of low power digital circuits (1994) (40)
- Low power multiplication algorithm for switching activity reduction through operand decomposition (2003) (40)
- A Benchmarking Methodology for Network Processors (2003) (40)
- Storage assignment to decrease code size (1995) (39)
- Statistical timing analysis of combinational circuits (1992) (38)
- Synthesis of hazard-free asynchronous circuits with bounded wire delays (1995) (38)
- Clinically Feasible Reconstruction Time for L1-SPIRiT Parallel Imaging and Compressed Sensing MRI (2009) (36)
- ANODEV2: A Coupled Neural ODE Framework (2019) (35)
- ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation (2020) (35)
- Chapter 7 – Exploring Trade-Offs in Performance and Programmability of Processing Element Topologies for Network Processors (2003) (35)
- Ubiquitous Parallel Computing from Berkeley, Illinois, and Stanford (2010) (34)
- Programming challenges in network processor deployment (2003) (34)
- The future of logic synthesis and physical design in deep-submicron process geometries (1997) (33)
- Scheduling task dependence graphs with variable task execution times onto heterogeneous multiprocessors (2008) (32)
- Mapping Concurrent Applications onto Architectural Platforms (2003) (32)
- Anatomy of a hardware compiler (1988) (31)
- Event suppression: improving the efficiency of timing simulation for synchronous digital circuits (1994) (31)
- ANODEV2: A Coupled Neural ODE Evolution Framework (2019) (31)
- Keynote: small neural nets are beautiful: enabling embedded systems with small deep-neural-network architectures (2017) (31)
- Challenges in code generation for embedded processors (1994) (31)
- Synthesis of verifiably hazard-free asynchronous control circuits (1991) (31)
- Large batch size training of neural networks with adversarial training and second-order information (2018) (31)
- Functional vector generation for HDL models using linear programming and Boolean satisfiability (2001) (30)
- Algorithms and Techniques for VLSI Layout and Synthesis (1988) (30)
- SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning (2020) (30)
- Synthesis for testability techniques for asynchronous circuits (1991) (30)
- Learned Token Pruning for Transformers (2021) (30)
- Testability properties of multilevel logic networks derived from binary decision diagrams (1991) (29)
- Invariant Information Bottleneck for Domain Generalization (2021) (29)
- OCCOM-efficient computation of observability-based code coverage metrics for functional verification (2001) (29)
- Network Processors: Origin of Species (2002) (29)
- Long term video segmentation through pixel level spectral clustering on GPUs (2011) (28)
- Communication-minimizing 2D convolution in GPU registers (2013) (28)
- Fast Deep Neural Network Training on Distributed Systems and Cloud TPUs (2019) (28)
- PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression (2019) (27)
- LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking (2019) (27)
- A kernel-finding state assignment algorithm for multi-level logic (1988) (27)
- Hessian-Aware Pruning and Optimal Neural Implant (2021) (27)
- Rethinking Distributional Matching Based Domain Adaptation (2020) (27)
- Acceleration of market value-at-risk estimation (2009) (27)
- CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs (2021) (27)
- Invited: Co-Design of Deep Neural Nets and Neural Net Accelerators for Embedded Vision Applications (2018) (26)
- Image2Point: 3D Point-Cloud Understanding with Pretrained 2D ConvNets (2021) (25)
- From blind certainty to informed uncertainty (2002) (25)
- Verification of asynchronous interface circuits with bounded wire delays (1992) (24)
- A new viewpoint on code generation for directed acyclic graphs (1998) (24)
- Functional Vector Generation for HDL Models Using (2001) (24)
- Design verification and reachability analysis using algebraic manipulation (1991) (23)
- Emotion Recognition From Multiple Modalities: Fundamentals and methodologies (2021) (23)
- K-LITE: Learning Transferable Visual Models with External Knowledge (2022) (23)
- Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection (2022) (22)
- SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis (2020) (22)
- On properties of algebraic transformations and the synthesis of multifault-irredundant circuits (1992) (22)
- Multi-Agent Collaboration via Reward Attribution Decomposition (2020) (21)
- Automated Task Allocation for Network Processors (2004) (21)
- Refining switching window by time slots for crosstalk noise calculation (2002) (21)
- Hardware-Software Co-Design and ESDA (1994) (21)
- BeBold: Exploration Beyond the Boundary of Explored Regions (2020) (20)
- Boundary thickness and robustness in learning models (2020) (20)
- An automata-theoretic approach to behavioral equivalence (1990) (20)
- Affective Image Content Analysis: Two Decades Review and New Perspectives (2021) (20)
- A disciplined approach to the development of platform architectures (2002) (20)
- NovelD: A Simple yet Effective Exploration Criterion (2021) (20)
- Inefficiency of K-FAC for Large Batch Size Training (2019) (20)
- MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation (2020) (20)
- Efficient Automatic Speech Recognition on the GPU (2011) (19)
- Exploring recognition network representations for efficient speech inference on highly parallel platforms (2010) (19)
- Design of integrated circuits fully testable for delay-faults and multifaults (1990) (19)
- Three competing design methodologies for ASIC's: architectural synthesis, logic synthesis and module generation (1989) (18)
- Fast speaker diarization using a high-level scripting language (2011) (18)
- Efficient Parallel CKY Parsing on GPUs (2011) (18)
- Achieving 550 MHz in an ASIC methodology (2001) (17)
- Automatic Replacement of Flip-Flops by Latches in ASICs (2004) (17)
- ImageNet Training in 24 Minutes (2017) (17)
- Rethinking Batch Normalization in Transformers (2020) (16)
- Audio-Based Multimedia Event Detection with DNNs and Sparse Sampling (2015) (16)
- Scalable HMM based inference engine in large vocabulary continuous speech recognition (2009) (16)
- Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization (2020) (16)
- A Design and Validation System for Asynchronous Circuits (1995) (15)
- HAO: Hardware-aware Neural Architecture Optimization for Efficient Inference (2021) (15)
- You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module (2021) (15)
- Image feature extraction for mobile processors (2009) (15)
- Boolean minimization and algebraic factorization procedures for fully testable sequential machines (1989) (14)
- Automatic Specialization of Actor-oriented Models in Ptolemy II (2002) (13)
- A quick safari through the reconfiguration jungle (Invited) (2001) (13)
- Scripting for EDA tools: a case study (2001) (13)
- Boda: A Holistic Approach for Implementing Neural Network Computations (2017) (12)
- Automated Task Allocation on Single Chip, Hardware Multithreaded, Multiprocessor Systems (2004) (12)
- Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space (2020) (12)
- Optimizing the use of GPU memory in applications with large data sets (2009) (12)
- Linear programming for sizing, Vth and Vdd assignment (2005) (12)
- Squeezeformer: An Efficient Transformer for Automatic Speech Recognition (2022) (11)
- Compile time task and resource allocation of concurrent applications to multiprocessor platforms (2009) (11)
- Panel: cell libraries - build vs. buy; static vs. dynamic (1999) (11)
- Code Optimization Techniques in Embedded DSP Microprocessors (1998) (11)
- Are single-chip multiprocessors in reach? (2001) (11)
- Minimum-power retiming for dual-supply CMOS circuits (2002) (11)
- Task allocation and scheduling of concurrent applications to multiprocessor systems (2008) (10)
- Three Competing Design Methodologies for ASIC's: Architectural Synthesis, Logic Synthesis and Module Generation (1989) (10)
- Integer-Only Zero-Shot Quantization for Efficient Speech Recognition (2021) (10)
- A Fast Post-Training Pruning Framework for Transformers (2022) (10)
- Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models (2021) (10)
- Using minimal minterms to represent programmability (2005) (10)
- Visual Transformers: Where Do Transformers Really Belong in Vision Models? (2021) (10)
- Efficient manycore CHMM speech recognition for audiovisual and multistream data (2010) (10)
- Automatic generation of application-specific accelerators for FPGAs from python loop nests (2012) (10)
- MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models (2021) (10)
- Monte Carlo–Based Financial Market Value-at-Risk Estimation on GPUs (2012) (9)
- Developing a Flexible Interface for RapidIO, Hypertransport, and PCI-Express (2004) (9)
- EmotionGAN (2018) (9)
- What is the state of the art in commercial EDA tools for low power? (1996) (9)
- The Need for Formal Methods for Integrated Circuit Design (1996) (9)
- Reinventing EDA with manycore processors (2008) (9)
- Parallel computing with patterns and frameworks (2010) (8)
- Emotional Semantics-Preserved and Feature-Aligned CycleGAN for Visual Emotion Adaptation (2020) (8)
- Considerations When Evaluating Microprocessor Platforms (2011) (8)
- Evaluating Self-Supervised Pretraining Without Using Labels (2020) (8)
- Overview of the Factors Affecting the Power Consumption (2007) (8)
- Integrated Model and Data Parallelism in Training Neural Networks (2017) (8)
- Parameter Re-Initialization through Cyclical Batch Size Schedules (2018) (8)
- Design Tools for Application Specific Embedded Processors (2002) (8)
- Boda-RTC: Productive generation of portable, efficient code for convolutional neural networks on mobile computing platforms (2016) (8)
- Accelerating Value‐at‐Risk estimation on highly parallel architectures (2012) (8)
- Spatially Parallel Convolutions (2018) (8)
- Parallel BFS graph traversal on images using structured grid (2010) (8)
- Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources (2020) (8)
- Challenges in CAD for the one million gate FPGA (1997) (7)
- Staged Training for Transformer Language Models (2022) (7)
- Reducing the Timing Overhead (2004) (7)
- Algorithm-hardware Co-design for Deformable Convolution (2019) (7)
- libHOG: Energy-Efficient Histogram of Oriented Gradient Computation (2015) (6)
- Reservoir Transformers (2020) (6)
- Cross-Domain Object Detection with Mean-Teacher Transformer (2022) (6)
- The ArtBench Dataset: Benchmarking Generative Models with Artworks (2022) (6)
- Monte Carlo methods: a computational pattern for our pattern language (2010) (6)
- Overview of the IDA System: A Toolset for VLSI Layout Synthesis (1987) (6)
- Towards a flexible network processor interface for rapidIO, hypertransport, and PCI-Express (2005) (6)
- Scalable multimedia content analysis on parallel platforms using python (2014) (6)
- Fast integration of eda tools and scripting language (2001) (6)
- A synthesis-based test generation and compaction algorithm for multifaults (1991) (6)
- The MARCO/DARPA Gigascale Silicon Research Center (1999) (6)
- Recognition of Tibetan wood block prints with generalized hidden Markov and kernelized modified quadratic distance function (2011) (6)
- Guest Editors' Introduction: Parallelism on the Desktop (2011) (6)
- Necessary and sufficient conditions for hazard-free robust transistor stuck-open-fault testability in multilevel networks (1992) (5)
- Designing a Sub-RISC Multi-Gigabit Regular Expression Processor (2006) (5)
- Static Crosstalk-Noise Analysis - For Deep Sub-Micron Digital Designs (2004) (5)
- Reservoir Transformer (2020) (5)
- Chip-Level Assembly (And Not the Integration of Synthesis and Physical) Is the Key to DSM Design (1999) (5)
- A Disciplined Approach to the Development of Architectural Platforms (2002) (5)
- Multi-source Few-shot Domain Adaptation (2021) (5)
- Hardware-software co-design (1993) (5)
- FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge (2020) (5)
- Coverage-Directed Generation of Biased Random Inputs for Functional Validation of Sequential Circuits (2001) (5)
- EDA (2020) (4)
- OCCOM: Efficient Computation of Observability-based Code Coverage Metrics for Functional Verification (1998) (4)
- Bright future for programmable processors (2001) (4)
- Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning (2022) (4)
- Technical Perspective: If I could only design one circuit … (2016) (4)
- Quantifying the Energy Efficiency of Object Recognition and Optical Flow (2014) (4)
- Namsel: An Optical Character Recognition System for Tibetan Text (2016) (4)
- ACM Transactions on Design Automation of Electronic Systems (TODAES) special section call for papers: Parallel CAD: Algorithm design and programming (2009) (4)
- Domain-Aware Dynamic Networks (2019) (4)
- A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications (2016) (4)
- Speeding up ImageNet Training on Supercomputers (2018) (4)
- DenseNet : Implementing Efficient ConvNet Descriptor Pyramids Technical Report (2014) (4)
- Improved Time-Resolved, 3D Phase Contrast Imaging through Variable Poisson Sampling and Partial Respiratory Triggering (2011) (4)
- The Need For Formal Verification In Hardware Design And What Formal Verification Has Not Done For Me Lately (1991) (4)
- ImageNet Training by CPU: AlexNet in 11 Minutes and ResNet-50 in 48 Minutes (2017) (3)
- On Properties of Algebraic Transformations and the Multifault Testability of Multilevel Logic (1989) (3)
- Evaluating the Effectiveness of Statistical Gate Sizing for Power Optimization (2005) (3)
- The Parallel Computing Laboratory at U.C. (2008) (3)
- Multitask Vision-Language Prompt Tuning (2022) (3)
- Why is Combinational ATPG Efficiently Solvable for Practical VLSI Circuits? (2001) (3)
- An algorithmic approach to optimizing fault coverage for BIST logic synthesis (1998) (3)
- PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map (2022) (3)
- Successfully Deploying the ASIP (2005) (3)
- Hardware/software codesign for mobile speech recognition (2013) (3)
- Scene-aware Learning Network for Radar Object Detection (2021) (3)
- Switch-Level Tools (1989) (3)
- Fast LSTM by dynamic decomposition on cloud and distributed systems (2020) (3)
- Panel: The ESDA Landscape: Who Will Dominate? (1995) (3)
- A Special Section on Multicore Parallel CAD: Algorithm Design and Programming (2011) (3)
- PDANet (2019) (3)
- Unsupervised Domain Adaptation: from Simulation Engine to the RealWorld (2018) (3)
- Technology Mapping (2008) (3)
- If I could only design one circuit ...: technical perspective (2016) (3)
- A parallel region based object recognition system (2011) (2)
- LEAP: Learnable Pruning for Transformer-based Models (2021) (2)
- nuFFTW: A Parallel Auto-Tuning Library for Performance Optimization of the nuFFT (2012) (2)
- Cross-Domain Sentiment Classification with In-Domain Contrastive Learning (2020) (2)
- CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification (2022) (2)
- Power Minimization with Multiple Supply Voltages and Multiple Threshold Voltages (2003) (2)
- Improving Context-Based Meta-Reinforcement Learning with Self-Supervised Trajectory Contrastive Learning (2021) (2)
- Chapter 13 – Sub-RISC Processors (2007) (2)
- High-Speed Logic, Circuits, Libraries and Layout (2004) (2)
- What’s Hidden in a One-layer Randomly Weighted Transformer? (2021) (2)
- Chapter 4 Introduction to Parallelizing Compressed Sensing Magnetic Resonance Imaging (2013) (2)
- Convolutional Monte Carlo Rollouts in Go (2015) (2)
- An Automatic Speech Recognition Application Framework for Highly Parallel Implementations on the GPU (2012) (2)
- Implicit enumeration techniques applied to asynchronous circuit verification (1993) (2)
- Recent progress in synthesis for testability (1991) (2)
- Synetgy (2019) (2)
- Differentiable NAS Framework and Application to Ads CTR Prediction (2021) (2)
- NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers (2022) (1)
- Evaluating Architectures for Application-Specific Parallel Scientific Computing Systems (2008) (1)
- CoDeNet: Algorithm-hardware Co-design for Deformable Convolution (2020) (1)
- Measuring the gap between programmable and fixed-function accelerators: A case study on speech recognition (2013) (1)
- Tipi: tiny instruction processors and interconnect (2005) (1)
- Roundtable: Machine Learning for Embedded Systems: Hype or Lasting Impact? (2018) (1)
- Annotation-Efficient Untrimmed Video Action Recognition (2020) (1)
- Large Batch Optimization for Deep Learning: Training BERT in 76 Minutes (2020) (1)
- Fast Cycle-Accurate Simulation and Instruction Set Generation for Constraint-Based Descriptions of Programmable Architectures (2004) (1)
- Trace Weighted Hessian-Aware Quantization (2019) (1)
- Introduction and Overview of the Book (2004) (1)
- Big Little Transformer Decoder (2023) (1)
- Soft multiprocessor systems for network applications (abstract only) (2005) (1)
- PALLAS: Mapping Applications onto Manycore (2011) (1)
- Domain-Adaptive Text Classification with Structured Knowledge from Unlabeled Data (2022) (1)
- A Processing Element and Programming Methodology for Click Elements (2005) (1)
- Cell libraries—build vs. buy; static vs. dynamic (panel) (1999) (1)
- Applying Text Analytics to the Mind-section Literature of the Tibetan Tradition of the Great Perfection (2020) (1)
- Chapter 3 Mapping Concurrent Applications onto Architectural Platforms (2002) (1)
- Architecting parallel programs (2008) (1)
- Fast LSTM Inference by Dynamic Decomposition on Cloud Systems (2019) (1)
- Three Fingered Jack: Tackling Portability, Performance, and Productivity with Auto-Parallelized Python (2013) (1)
- Prototype-Voxel Contrastive Learning for LiDAR Point Cloud Panoptic Segmentation (2022) (1)
- Timing Analysis of Combinational Logic Circuits (1993) (1)
- Full Stack Optimization of Transformer Inference: a Survey (2023) (0)
- Hardware/software co-verification (panel) (1997) (0)
- Summary and Trends (1989) (0)
- Chapter 6 Introduction to Speech and Multimedia Applications in the Par Lab (2013) (0)
- Driving dataset in the wild: Various driving scenes by Byung Gon (2016) (0)
- Improving Performance through Microarchitecture (2004) (0)
- Roundtable: Machine learning for embedded systems (2018) (0)
- Is statistical timing statistically significant? (2004) (0)
- Fast Cycle-Accurate Simulation and Instruction Set Generation for Constraint-Based Descriptions of Programmable Architectures (2004) (0)
- Ubiquitous Parallel Computing: The Par Lab at Berkeley, UPCRC-Illinois, and the Pervasive Parallel Laboratory at Stanford are studying how to make parallel programming succeed given industry's recent shift to multicore computing. All three centers assume that future microprocessors will have hundreds (2011) (0)
- Chapter 8 Introduction to Design Patterns for Parallel Computing (2013) (0)
- Embedded systems panel: HW and SW in embedded system design: loveboat, shipwreck, or ships passing in the night (1999) (0)
- Judiciously Using Benchmarking (2005) (0)
- that scale with increasing numbers of cores should be as easy as writing programs for sequential computers (2018) (0)
- SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection (2023) (0)
- EDA: this is serious business (2004) (0)
- 4.1 Example Computation (2007) (0)
- What is the next big productivity boost for designers? (panel) (1993) (0)
- 7. CONCLUSION (2002) (0)
- Design Flows (2018) (0)
- Inclusively Identifying the Architectural Space (2005) (0)
- Masked Layer Distillation: Fast and Robust Training Through Knowledge Transfer Normalization (2021) (0)
- Prior Knowledge-Guided Attention in Self-Supervised Vision Transformers (2022) (0)
- The next-generation HDL (panel) (1997) (0)
- Megatrends and EDA 2017 (2007) (0)
- Impact and Evaluation of Competing Implementation Media for ASIC's (Panel Abstract) (1990) (0)
- Part 1: BDD Project Report 1.1: Design Space Exploration for Deep Neural Nets for Advanced Driver Assistance Systems 1.2: Principal Investigator Information Professor (2016) (0)
- A Unified Approach to the Synthesis of Fully Testable Sequential Machines (1990) (0)
- Unified tools for SoC embedded systems: mission critical, mission impossible or mission irrelevant? (2002) (0)
- Study of Entity Detection and Identification using Deep Learning Techniques a Survey (2020) (0)
- Quadric Representations for LiDAR Odometry, Mapping and Localization (2023) (0)
- Knowledge-Guided Self-Supervised Vision Transformers for Medical Imaging (2022) (0)
- UnrealNAS: Can We Search Neural Architectures with Unreal Data? (2022) (0)
- Session details: Panel (2008) (0)
- Framework Design Pattern Pattern Language (2013) (0)
- Scaling Vision-Language Models with Sparse Mixture of Experts (2023) (0)
- Foundations of Hybrid and Embedded Software Systems (2003) (0)
- Automatic Layout of Switch-Level Designs (1989) (0)
- Bridging the Performance Gap between Manual and Automatic Compilers with Intent-based Compilation (2015) (0)
- Linear Programming for Gate Sizing (2007) (0)
- Session details: What happened to ASIC? go (recon) figure? (2004) (0)
- Boda (2020) (0)
- Finding Peak Performance in a Process (2004) (0)
- Open-Vocabulary Point-Cloud Object Detection without 3D Annotation (2023) (0)
- Mobile Communications: Demands on VLSI Technology, Design and CAD (1996) (0)
- Accelerating Speech Recognition on Multicore and Manycore Platforms (2009) (0)
- Linear Programming for Multi-Vth and Multi-Vdd Assignment (2007) (0)
- Chapter 3 The Content-Based Image Retrieval Project (2013) (0)
- The IMAGES Language (1989) (0)
- Pipelining to Reduce the Power (2007) (0)
- Scaling Up Machine Learning: Scalable Parallelization of Automatic Speech Recognition (2011) (0)
- Achieving 550MHz in a Standard Cell ASIC Methodology (2004) (0)
- Supplementary Material for Automatic Augmentation Policies for Self-Supervised Learning (2021) (0)
- Proceedings of the 1995 International Symposium on Low Power Design 1995, Dana Point, California, USA, April 23-26, 1995 (1995) (0)
- The Content-Based Image Retrieval Project (2013) (0)
- Chapter 4 PALLAS: Mapping Applications onto Manycore (2010) (0)
- CEO Panel: EDA: This is serious business (2004) (0)
- Session details: Megatrends and EDA 2017 (2007) (0)
- Analysis of Quantization on MLP-based Vision Models (2022) (0)
- Yet No Bigger than Your Thumb: The Tshon Gang in Bon Dzogchen (2018) (0)
- Addendum to "Synthesis of robust delay-fault testable circuits: Theory" (1996) (0)
- Texturing Long Planar Surfaces with Imprecise Camera Poses for Indoor 3D Modeling (2012) (0)
- Enabling Technology For More Pervasive And Responsive Market Risk Management Systems (2011) (0)
- HW and SW in embedded system design: loveboat, shipwreck, or ships passing in the night (1999) (0)
- Register Transfer Level Synthesis: From Theory to Practice (1996) (0)
- Enhancements to the IMAGES Language for Synthesis (1989) (0)
What Schools Are Affiliated With Kurt Keutzer?
Kurt Keutzer is affiliated with the following schools: