Alexandru Nicolau

Alexandru Nicolau's AcademicInfluence.com Rankings

Computer Science: World Rank #7052, Historical Rank #7427
Computer Architecture: World Rank #43, Historical Rank #45
Database: World Rank #4112, Historical Rank #4276

Why Is Alexandru Nicolau Influential?

Alexandru Nicolau's Published Works

[Chart: number of citations in a given year to any of this author's works]

[Chart: total number of citations to the author's works published in a given year, highlighting the author's most important publications]

Published Works

EXPRESSION: a language for architecture exploration through compiler/simulator retargetability (1999) (439)
SPARK: a high-level synthesis framework for applying parallelizing compiler transformations (2003) (429)
Automatic program parallelization (1993) (340)
Efficient utilization of scratch-pad memory in embedded processor applications (1997) (294)
Optimal loop parallelization (1988) (286)
On-chip vs. off-chip memory: the data partitioning problem in embedded processor-based systems (2000) (247)
Parallelizing Programs with Recursive Data Structures (1989) (233)
A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors (2009) (209)
Measuring the Parallelism Available for Very Long Instruction Word Architectures (1984) (186)
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration (1998) (179)
Profile-based dynamic voltage scheduling using program checkpoints (2002) (177)
Partitioned Register Files For VLIWs: A Preliminary Analysis Of Tradeoffs (1992) (172)
Perfect Pipelining: A New Loop Parallelization Technique (1988) (155)
Underdesigned and Opportunistic Computing in Presence of Hardware Variability (2013) (152)
Memory Issues in Embedded Systems-on-Chip (1999) (152)
Adapting cache line size to application behavior (1999) (140)
Percolation based synthesis (1990) (138)
Percolation Scheduling: A Parallel Compilation Technique (1985) (134)
Integrated power management for video streaming to mobile handheld devices (2003) (133)
Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared-Memory Machines (1989) (131)
Run-Time Disambiguation: Coping with Statically Unpredictable Dependencies (1989) (113)
Abstractions for recursive pointer data structures: improving the analysis and transformation of imperative programs (1992) (112)
Augmenting Loop Tiling with Data Alignment for Improved Cache Performance (1999) (108)
Efficient simulation of large-scale Spiking Neural Networks using CUDA graphics processors (2009) (107)
A general data dependence test for dynamic, pointer-based data structures (1994) (107)
Advances in languages and compilers for parallel processing (1991) (106)
Coordinated parallelizing compiler optimizations and high-level synthesis (2004) (105)
A global resource-constrained parallelization technique (1989) (97)
A Development Environment for Horizontal Microcode (1986) (87)
Uniform Parallelism Exploitation in Ordinary Programs (1985) (82)
Memory data organization for improved cache performance in embedded processor applications (1997) (80)
Local memory exploration and optimization in embedded systems (1999) (79)
Architectural exploration and optimization of local memory in embedded systems (1997) (77)
Comparison of Compacting Algorithms for Garbage Collection (1983) (75)
Incremental tree height reduction for high level synthesis (1991) (73)
Resource-Constrained Software Pipelining (1995) (71)
DYNAMO: A Cross-Layer Framework for End-to-End QoS and Energy Optimization in Mobile Handheld Devices (2007) (70)
Data Memory Organization and Optimizations in Application-Specific Systems (2001) (68)
Using global code motions to improve the quality of results for high-level synthesis (2004) (68)
Network topology exploration of mesh-based coarse-grain reconfigurable architectures (2004) (66)
Trailblazing: A Hierarchical Approach to Percolation Scheduling (1993) (63)
Parallel processing: a smart compiler and a dumb machine (1984) (62)
A cross-layer approach for power-performance optimization in distributed mobile systems (2005) (59)
Memory aware compilation through accurate timing extraction (2000) (59)
Power savings in embedded processors through decode filter cache (2002) (56)
Memory organization for improved data cache performance in embedded processors (1996) (56)
R-Kleene: A High-Performance Divide-and-Conquer Algorithm for the All-Pair Shortest Path for Densely Connected Networks (2007) (54)
Functional abstraction driven design space exploration of heterogeneous programmable architectures (2001) (53)
On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings (2006) (53)
V-SAT: a visual specification and analysis tool for system-on-chip exploration (2001) (52)
Design of a predictive filter cache for energy savings in high performance processor architectures (2001) (51)
Annotating the Java Bytecodes in Support of Optimization (1997) (51)
An efficient compiler technique for code size reduction using reduced bit-width ISAs (2002) (50)
RTGEN: an algorithm for automatic generation of reservation tables from architectural descriptions (1999) (50)
Java annotation-aware just-in-time (AJIT) compilation system (1999) (50)
Loop shifting and compaction for the high-level synthesis of designs with complex control flow (2004) (50)
Bypass aware instruction scheduling for register file power reduction (2006) (49)
Architecture Description Languages for Systems-on-Chip Design (1999) (44)
Optimal register assignment to loops for embedded code generation (1996) (43)
Access pattern based local memory customization for low power embedded systems (2001) (43)
Parallel processing: a smart compiler and a dumb machine (2004) (43)
Architectural and compiler strategies for dynamic power management in the COPPER project (2001) (41)
Exploiting off-chip memory access modes in high-level synthesis (1997) (39)
EXPRESSION: An ADL for system level design exploration (1998) (38)
CyberPhysical-System-On-Chip (CPSoC): A self-aware MPSoC paradigm with cross-layer virtual sensing and actuation (2015) (38)
Loop Quantization: an Analysis and Algorithm (1987) (37)
Speculation techniques for high level synthesis of control intensive designs (2001) (37)
The Strict Time Lower Bound and Optimal Schedules for Parallel Prefix with Resource Constraints (1996) (36)
Power / capacity scaling: Energy savings with simple fault-tolerant caches (2014) (36)
APEX: access pattern based memory architecture exploration (2001) (35)
Mutation Scheduling: A Unified Approach to Compiling for Fine-Grain Parallelism (1994) (34)
Comparative architectural characterization of SPEC CPU2000 and CPU2006 benchmarks on the intel® Core™ 2 Duo processor (2008) (34)
Automatic verification of in-order execution in microprocessors with fragmented pipelines and multicycle functional units (2002) (34)
SmartBalance: A sensing-driven linux load balancer for energy efficiency of heterogeneous MPSoCs (2015) (34)
Languages and Compilers for Parallel Computing (1993) (33)
Data cache sizing for embedded processor applications (1998) (33)
VaMV: Variability-aware Memory Virtualization (2012) (33)
Reducing data cache energy consumption via cached load/store queue (2003) (33)
Loop Quantization: A Generalized Loop Unwinding Technique (1988) (33)
Tight analysis of the performance potential of thread speculation using spec CPU 2006 (2007) (33)
Register Allocation, Renaming and Their Impact on Fine-Grain Parallelism (1991) (32)
Loop Quantization or Unwinding Done Right (1987) (32)
Reducing power consumption for high-associativity data caches in embedded processors (2003) (32)
Processor-memory co-exploration driven by a Memory-Aware Architecture Description Language (2001) (31)
Abstract description of pointer data structures: an approach for improving the analysis and optimization of imperative programs (1992) (31)
Energy efficient watermarking on mobile devices using proxy-based partitioning (2006) (31)
A language for conveying the aliasing properties of dynamic, pointer-based data structures (1994) (31)
A Mapping Strategy for MIMD Computers (1993) (31)
A customizable compiler framework for embedded systems (2001) (30)
MIST: an algorithm for memory miss traffic management (2000) (28)
Parallelism, memory anti-aliasing and correctness for trace scheduling compilers (disambiguation, flow-analysis, compaction) (1984) (28)
Towards parallelizing the layout engine of firefox (2010) (27)
Challenges in exploitation of loop parallelism in embedded applications (2006) (27)
ViPZonE: OS-level memory variability-driven physical address zoning for energy savings (2012) (27)
Fault tolerance in super-scalar and vliw processors (1991) (27)
Exploiting parallelism in matrix-computation kernels for symmetric multiprocessor systems: Matrix-multiplication and matrix-addition algorithm optimizations by software pipelining and threads allocation (2011) (27)
Memory Architecture Exploration for Programmable Embedded Systems (2002) (26)
Automatic modeling and validation of pipeline specifications driven by an architecture description language [SoC] (2002) (26)
WebRTCbench: a benchmark for performance assessment of webRTC implementations (2015) (25)
Adaptive Strassen's matrix multiplication (2007) (25)
OpenCV.js: computer vision processing for the open web platform (2018) (25)
Optimal schedules for parallel prefix computation with bounded resources (1991) (25)
Elimination of redundant memory traffic in high-level synthesis (1996) (25)
On-chip self-awareness using Cyberphysical-Systems-on-Chip (CPSoC) (2014) (25)
Incorporating DRAM access modes into high-level synthesis (1998) (24)
Operation tables for scheduling in the presence of incomplete bypassing (2004) (24)
The Design of the PROMIS Compiler (1999) (24)
Conditional speculation and its effects on performance and area for high-level synthesis (2001) (23)
Dynamically increasing the scope of code motions during the high-level synthesis of digital circuits (2003) (23)
A Unified code generation approach using mutation scheduling (1994) (23)
Using an oracle to measure potential parallelism in single instruction stream programs (1981) (23)
Improving cache Performance Through Tiling and Data Alignment (1997) (22)
Adaptive Winograd's matrix multiplications (2009) (22)
Multi-layer memory resiliency (2014) (21)
Dynamic common sub-expression elimination during scheduling in high-level synthesis (2002) (21)
Abstractions for Recursive Pointer Data Structures: Improving the Analysis of Imperative Programs (1992) (21)
A data alignment technique for improving cache performance (1997) (20)
History-aware Self-Scheduling (2006) (20)
Automatic software toolkit generation for embedded systems-on-chip (1999) (20)
Interface synthesis using memory mapping for an FPGA platform (2003) (20)
Integrating Program Transformations In The Memory-based Synthesis Of Image And Video Algorithms (1994) (20)
FORGE: a framework for optimization of distributed embedded systems software (2003) (20)
Performance evaluation for application-specific architectures (1995) (20)
Interference analysis tools for parallelizing programs with recursive data structures (1989) (19)
PBExplore: a framework for compiler-in-the-loop exploration of partial bypassing in embedded processors (2005) (18)
Advanced Environments, Tools, and Applications for Cluster Computing (2002) (17)
Compilation framework for code size reduction using reduced bit-width ISAs (rISAs) (2006) (17)
Fractal Matrix Multiplication: A Case Study on Portability of Cache Performance (2001) (17)
CAMFAS: A Compiler Approach to Mitigate Fault Attacks via Enhanced SIMDization (2017) (17)
Achieving Multi-level Parallelization (1997) (16)
Static Scheduling for Dynamic Dataflow Machines (1990) (16)
Resource Directed Loop Pipelining: Exposing Just Enough Parallelism (1997) (16)
A Simple Mechanism for Improving the Accuracy and Efficiency of Instruction-Level Disambiguation (1995) (16)
PBPAIR: an energy-efficient error-resilient encoding using probability based power aware intra refresh (2006) (16)
High-Level synthesis with Synchronous and RAMBUS DRAMs (1998) (15)
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing (1991) (15)
Interconnect-Aware Mapping of Applications to Coarse-Grain Reconfigurable Architectures (2004) (15)
Memory system connectivity exploration (2002) (15)
An empirical study of the effect of source-level loop transformations on compiler stability (2018) (15)
Coordinated transformations for high-level synthesis of high performance microprocessor blocks (2002) (15)
SIMD-based soft error detection (2016) (15)
Realistic scheduling: compaction for pipelined architectures (1990) (15)
Register File Power Reduction Using Bypass Sensitive Compiler (2008) (15)
Aggregating processor free time for energy reduction (2005) (15)
Using Recursion to Boost ATLAS's Performance (2005) (14)
NSF expedition on variability-aware software: Recent results and contributions (2015) (14)
Design considerations for limited connectivity vliw architectures (1992) (14)
A Geometric Approach for Partitioning N-Dimensional Non-rectangular Iteration Spaces (2004) (14)
Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems (2005) (14)
Minimization of Memory Traffic in High-Level Synthesis (1994) (14)
Techniques for efficient placement of synchronization primitives (2009) (14)
Synchronization optimizations for efficient execution on multi-cores (2009) (13)
Using profiling to reduce branch misprediction costs on a dynamically scheduled processor (2000) (13)
Incorporating compiler feedback into the design of ASIPs (1995) (12)
Access pattern-based memory and connectivity architecture exploration (2003) (12)
Acceleration Framework for FPGA Implementation of OpenVX Graph Pipelines (2018) (12)
A Fine-Grain Parallelizing Compiler (1986) (12)
Dynamic conditional branch balancing during the high-level synthesis of control-intensive designs (2003) (12)
VISTA: The Visual Interface for Scheduling Transformations and Analysis (1993) (12)
The PROMIS compiler prototype (1997) (11)
New directions in compiler technology for embedded systems (2001) (11)
Architecture description language driven design space exploration in the presence of coprocessors (2001) (11)
High performance annotation-aware JVM for Java cards (2005) (11)
A design space exploration framework for reduced bit-width Instruction Set architecture (rISA) design (2002) (11)
A performance evaluator for parameterized ASIC architectures (1994) (11)
AVid: Annotation driven video decoding for hybrid memories (2012) (11)
Efficient hardware for multiway jumps and pre-fetches (1985) (11)
DPCS: Dynamic Power/Capacity Scaling for SRAM Caches in the Nanoscale Era (2015) (11)
Variability-aware memory management for nanoscale computing (2013) (11)
An Efficient Load Balancing Scheme for Grid-based High Performance Scientific Computing (2005) (11)
On the exploitation of loop-level parallelism in embedded applications (2009) (11)
On the Determination of Inlining Vectors for Program Optimization (2013) (10)
Compiler-Directed Cache Line Size Adaptivity (2000) (10)
Lightweight lock-free synchronization methods for multithreading (2006) (10)
Automatic validation of pipeline specifications (2001) (10)
Software Annotations for Power Optimization on Mobile Devices (2006) (10)
Loop Quantization: Unwinding for Fine-Grain Parallelism Exploitation (1985) (10)
Annotation Based Multimedia Streaming Over Wireless Networks (2006) (10)
LORE: A loop repository for the evaluation of compilers (2017) (10)
Computing Programs Containing Band Linear Recurrences on Vector Supercomputers (1996) (10)
Pruning hardware evaluation space via correlation-driven application similarity analysis (2011) (10)
Expression equivalence checking using interval analysis (2006) (10)
OpenCV.js: Computer Vision Processing for the Web (2017) (10)
An annotation-aware Java virtual machine implementation (2000) (9)
Equivalence checking of arithmetic expressions using fast evaluation (2005) (9)
Retargetable pipeline hazard detection for partially bypassed processors (2006) (9)
Speedup of band linear recurrences in the presence of resource constraints (1992) (9)
A fault tolerant self-scheduling scheme for parallel loops on shared memory systems (2012) (9)
AFFIX: Automatic Acceleration Framework for FPGA Implementation of OpenVX Vision Algorithms (2019) (9)
A Framework for Data Dependence Testing in the Presence of Pointers (1994) (9)
Automatic Design Space Exploration of Register Bypasses in Embedded Processors (2007) (9)
A Percolation Based VLIW Architecture (1991) (9)
Optimal register assignment to loops for embedded code generation (1995) (9)
Teaching Parallel Computing and Dependence Analysis with Python (2019) (8)
Reducing power with an L0 instruction cache using history-based prediction (2002) (8)
Concurrent Information Processing and Computing (2005) (8)
Efficient Scheduling of Nested Parallel Loops on Multi-Core Systems (2009) (8)
A simplified java bytecode compilation system for resource-constrained embedded processors (2007) (8)
Data-rate-aware FPGA-based acceleration framework for streaming applications (2016) (8)
An Efficient Global Resource Constrained Technique for Exploiting Instruction Level Parallelism (1992) (8)
A novel approach for partitioning iteration spaces with variable densities (2005) (8)
Exploitation of nested thread-level speculative parallelism on multi-core systems (2010) (8)
Large-scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU) (2015) (8)
Software fault tolerance for FPUs via vectorization (2015) (7)
Automatic generation of operation tables for fast exploration of bypasses in embedded processors (2006) (7)
Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption (2009) (7)
Accelerating Brain Circuit Simulations of Object Recognition with a Sony PlayStation 3 (2007) (7)
Proceedings of the 16th international conference on Supercomputing (2002) (7)
ViPZonE: Hardware Power Variability-Aware Virtual Memory Management for Energy Savings (2015) (7)
A development environment for scientific parallel programs (1986) (7)
Managing Cross-Layer Constraints for Interactive Mobile Multimedia (2003) (7)
An environment for the development of microcode for pipelined architectures (1992) (6)
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming (2013) (6)
An efficient, global resource-directed approach to exploiting instruction-level parallelism (1996) (6)
Compiler-Directed Cache Assist Adaptivity (2000) (6)
Caching values in the load store queue (2004) (6)
Energy analysis of multimedia watermarking on mobile handheld devices (2005) (6)
Optimizing Program Performance via Similarity, Using a Feature-Agnostic Approach (2013) (6)
A general approach for partitioning N-dimensional parallel nested loops with conditionals (2006) (6)
Resource-Directed Loop Pipelining (1996) (6)
Parallel Language and Compiler Research in Japan (1995) (6)
Short-Circuit Compiler Transformation: Optimizing Conditional Blocks (2007) (6)
High-Level Synthesis with SDRAMs and RAMBUS DRAMs (Special Section on VLSI Design and CAD Algorithms) (1999) (6)
Simultaneous Way footprint Prediction and Branch Prediction for Energy Savings in Set associative Instruction Caches (2001) (6)
An Approach to Combine Predicated/Speculative Execution for Programs with Unpredictable Branches (1994) (6)
An Efficient Approach for Self-scheduling Parallel Loops on Multiprogrammed Parallel Computers (2005) (6)
Percolation scheduling for non-VLIW machines (1990) (6)
A method for register allocation to loops in multiple register file architectures (1996) (6)
New directions in compiler technology for embedded systems (embedded tutorial) (2001) (6)
Memory Subsystem Description in EXPRESSION (2000) (6)
Selected papers of the second workshop on Languages and compilers for parallel computing (1990) (6)
Harmonic scheduling of linear recurrences for digital filter design (1992) (6)
Control flow optimization in loops using interval analysis (2008) (5)
A hypergraph-based model for port allocation on multiple-register-file VLIW architectures (1995) (5)
Applying an Abstract Data Structure Description Approach to Parallelizing Scientific Pointer Programs (1992) (5)
Using Hardware Counters to Predict Vectorization (2017) (5)
Impact of JVM superoperators on energy consumption in resource-constrained embedded systems (2008) (5)
Customizing Software Toolkits for Embedded Systems-On-Chip (2000) (5)
Using a Way Cache to Improve Performance of Set-Associative Caches (2005) (5)
Proxy-based task partitioning of watermarking algorithms for reducing energy consumption in mobile devices (2004) (5)
Novel Brain-Derived Algorithms Scale Linearly with Number of Processing Elements (2007) (5)
Brain Derived Vision Algorithm on High Performance Architectures (2009) (5)
Cache-aware partitioning of multi-dimensional iteration spaces (2009) (5)
Cache-aware iteration space partitioning (2008) (5)
Enhanced Loop Coalescing: A Compiler Technique for Transforming Non-uniform Iteration Spaces (2005) (5)
Content-aware Power Optimizations for Multimedia Streaming Over Wireless Networks (2006) (5)
Annotation-aware dynamic compilation and interpretation (2002) (5)
How Many Threads to Spawn during Program Multithreading? (2010) (4)
An annotation‐aware Java virtual machine implementation (2000) (4)
Instruction Level Parallelism (2016) (4)
Minimization of Memory Traffic in High-Level Synthesis (1995) (4)
Improving numerical accuracy for non-negative matrix multiplication on GPUs using recursive algorithms (2013) (4)
A predictive decode filter cache for reducing power consumption in embedded processors (2007) (4)
Using Data Dependence Analysis and Loop Transformations to Teach Vectorization (2017) (4)
The Design of the PROMIS Compiler—Towards Multi-Level Parallelization (2000) (4)
Video Stream Annotations for Energy Trade-offs in Multimedia Applications (2006) (4)
Parallelizing tightly nested loops (1991) (4)
High-Level Synthesis of Scalable Architectures for IIR Filters Using Multichip Modules (1993) (4)
Adaptive Line Size Cache (1999) (4)
Just in Time Load Balancing (2012) (4)
Miss Elimination by Time stride Prefetch (2000) (4)
Improving the Accuracy of High Performance BLAS Implementations Using Adaptive Blocked Algorithms (2011) (4)
Specification of Hazards, Stalls, Interrupts, and Exceptions in EXPRESSION (2001) (4)
Improving accuracy for matrix multiplications on GPUs (2011) (4)
Research Directions in Compiling For Massive Parallelism (1992) (3)
Partitioning of Variables for Multiple-Register-File Architectures via Hypergraph Coloring (1994) (3)
On the efficacy of call graph-level thread-level speculation (2010) (3)
ServiceFORGE: A Software Architecture for Power and Quality Aware Services (2003) (3)
Symbolic Analysis in the PROMIS Compiler (1999) (3)
Using Annotations to Facilitate Power vs Quality Trade-offs in Streaming Applications (2005) (3)
Partitioning of Variables for Multiple-Register-File VLIW Architectures (1994) (3)
Parallelizing Non-Vectorizable Loops for MIMD Machines (1990) (3)
A radiative transfer module for calculating photolysis rates and solar heating in climate models: Solar-J v7.5 (2017) (3)
Exploring scalable schedules for IIR filters with resource constraints (1999) (3)
A development environment for horizontal microcode programs (1986) (3)
A Data Cache with Dynamic Mapping (2003) (3)
Data dependence testing in the presence of pointers and pointer-based data structures (1998) (3)
A Hierarchical Parallelizing Compiler for VLIW/MIMD Machines (1992) (3)
DPCS (2015) (2)
EdgeAvatar: An Edge Computing System for Building Virtual Beings (2021) (2)
Fine grain software pipelining of non-vectorizable nested loops (1991) (2)
A compiler-driven supercomputer (1986) (2)
Cache with Adaptive Fetch Size (2000) (2)
N-dimensional perfect pipelining (1992) (2)
Automatic program parallelization: Languages and compilers (1993) (2)
Towards an Achievable Performance for the Loop Nests (2018) (2)
Getting High Performance with Slow Memory (1986) (2)
Optimizing control flow in loops using interval and dependence analysis (2009) (2)
Languages and Compilers for Parallel Computing: 7th International Workshop, Ithaca, NY, USA, August 8 - 10, 1994. Proceedings (1995) (2)
Analyzing the individual/combined effects of speculative and guarded execution on a superscalar architecture (1998) (2)
Scalable techniques for computing band linear recurrences on massively parallel and vector supercomputers (1994) (2)
Line size adaptivity analysis of parameterized loop nests for direct mapped data cache (2005) (2)
Percolation scheduling with resource constraints (1989) (2)
Languages and Compilers for Parallel Computing: 5th International Workshop, New Haven, Connecticut, USA, August 3-5, 1992: Proceedings (1993) (2)
Aggressive Memory-Aware Compilation (2000) (2)
Fine-grain compilation for pipelined machines (1988) (2)
Probability based power aware error resilient coding (2005) (2)
Comparative characterization of SPEC CPU2000 and CPU2006 on Itanium® architecture (2007) (2)
Accelerating Brain Circuit Simulations of Object Recognition with CELL Processors (2007) (2)
Polygonal Iteration Space Partitioning (2016) (2)
A hierarchical approach to instruction-level parallelization (1995) (2)
Microflow: A Fine-Grain Parallel Processing Approach (1985) (2)
Overview of ILP Architectures (2016) (2)
EXPRESSION User Manual version 1.0 (2003) (1)
Proceedings of the 23rd international conference on Supercomputing (2009) (1)
Harmonic Scheduling: A Technique for Scheduling beyond Loop-Carried Dependencies (1993) (1)
Software Pipelining by Kernel Recognition (2016) (1)
Pretty Good Accuracy in Matrix Multiplication with GPUs (2010) (1)
Selective search of inlining vectors for program optimization (2012) (1)
A Model-Based Approach to System Specification for Distributed Real-time and Embedded Systems (2004) (1)
Dynamically adaptive fetch size prediction for data caches (2003) (1)
Detecting COVID-19 Related Pneumonia On CT Scans Using Hyperdimensional Computing (2021) (1)
Languages and Compilers for Parallel Computing: 6th International Workshop, Portland, Oregon, USA, August 12 - 14, 1993. Proceedings (1994) (1)
Languages and Compilers for Parallel Computing (2001) (1)
NumbaSummarizer: A Python Library for Simplified Vectorization Reports (2020) (1)
A Case for an Adaptive and Opportunistic Variability-Aware Memory Virtualization Layer (2011) (1)
A global resource-constrained parallelization technique (1989) (1)
JuliusC: A Practical Approach for the Analysis of Divide-and-Conquer Algorithms (2004) (1)
On-Chip vs. Off-Chip Memory: Utilizing Scratch-Pad Memory (1999) (1)
Modulo Scheduling and Loop Pipelining (2011) (1)
On the evaluation and extraction of thread-level parallelism in ordinary programs (2008) (1)
MCompiler: A Synergistic Compilation Framework (2019) (1)
PBPAIR: Probability Based Power Aware Intra Refresh, A New Energy-efficient Error-resilient Encoding Scheme (2005) (1)
Parallelization of programs containing loop-carried dependences with resource constraints (1994) (1)
Power-Aware Multimedia Streaming in Heterogeneous Multi-User Environments (2003) (1)
Effective Evaluation of Multi-core Based Systems (2013) (1)
Pruning Hardware Evaluation Space via Causality-Driven Application Similarity Analysis (2010) (1)
Path Collection and Dependence Testing in the Presence of Dynamic, Pointer-Based Data Structures (1995) (1)
Compiler-in-the-Loop ADL-driven Early Architectural Exploration (2005) (1)
AFFIX (2019) (1)
CECS TR Coversheet New (2015) (0)
A Framework for GUI-driven Design Space Exploration of a MIPS 4K-like processor (2003) (0)
Low-level programming for a massively parallel fine-grain computer: the Microflow approach (1987) (0)
The x-legion: a compiler-approach to exploit locality and portability of divide-and-conquer algorithms (2005) (0)
Falcon: a Matlab Interactive Restructuring Compiler (1995) (0)
From the guest editors (2007) (0)
High-Level Synthesis of Scalable Architectures for IIR Filters Using Parameterized MCMs (1992) (0)
Copy Elimination for Parallelizing Compilers (1998) (0)
A new technique for induction variable removal (1991) (0)
Algorithms of All Pair Shortest Path Problem (2016) (0)
Register allocation issues in embedded code generation (1998) (0)
Editors' introduction (2007) (0)
New Opportunities for Compilers in Computer Security (2018) (0)
Performance Characterization of Itanium® 2-Based Montecito Processor (2009) (0)
Concurrent Information Processing and Computing (NATO Science) (2005) (0)
Architecture exploration of parameterizable EPIC SOS architectures (poster paper) (2000) (0)
Selective Guarded Execution Using Profiling on a Dynamically Scheduled Processor (2007) (0)
Compile time vs. runtime: scheduling parallelism on dataflow machines (1989) (0)
Hierarchical Parallelism Exploitation (2015) (0)
A Heterogeneous Solution to the All-pairs Shortest Path Problem using FPGAs (2022) (0)
Editor's Announcement (1998) (0)
Memory Architecture Exploration (1999) (0)
Is computer science dying? (2016) (0)
Regular schedules for scalable design of IIR filters (1993) (0)
Software pipelining of non-vectorizable loosely nested loops (1991) (0)
Fault Tolerance for FPUs via Vectorization (2015) (0)
Massive Parallelism and Fine-Grain Parallelism: are They Incompatible? (1993) (0)
Scheduling Basic Blocks (2016) (0)
Disambiguation, Correctness and Flow-Analysis Issues for Trace Scheduling Compilers (1984) (0)
Hardware App 1 App 2 App N Cross-Layer Sensors (Virtual & Physical) Decisions & Learning (Controller) Actuation (software and hardware) (2014) (0)
Session details: Optimizing parallel applications (2009) (0)
Optimizing Program Performance via Similarity, Using Feature-aware and Feature-agnostic Characterization Approaches (2013) (0)
Leveraging profile-selected execution patterns for optimized code execution in resource-constrained systems (2010) (0)
Selective guarded execution using profiling on a dynamically scheduled processor (1999) (0)
Acknowledgment to Reviewers (1978) (0)
A Unified Power Management Framework for Distributed Video Streaming to Mobile Devices (2004) (0)
Comparative characterization of SPEC CPU2000 and CPU2006 on Itanium architecture (2007) (0)
MIMD Lattice Computation (2011) (0)
Author retrospective for a global resource-constrained parallelization technique (2014) (0)
How Do We Make Parallel Processing a Reality? Bridging the Gap Between Theory and Practice (1991) (0)
PBExplore: A Compiler-in-the-Loop Framework for Design Space Exploration of Partially Bypassed Processor Pipelines, User Manual Version 1.0, 01/15/2008 (2008) (0)
Harmonic Scheduling of Linear Recurrences in Digital Filter Design (2015) (0)
Architecture exploration of parameterizable EPIC SOC architectures (2000) (0)
Incremental tree height reduction for code compaction (1990) (0)
Multi-Layer Memory Resiliency Invited Paper in Special Session "Embedded Resiliency: Approaches for the Next Decade" (2014) (0)
If software is king for systems-on-silicon, what's new in compilers? (1997) (0)
Hierarchical parallelism exploitation (1989) (0)
Low Energy Associative Data Caches for Embedded Systems (2003) (0)
A Compilation and Run-Time Framework for Maximizing Performance of Self-scheduling Algorithms (2014) (0)
Operation Tables for Scheduling in the Presence of Incomplete Bypassing (2004) (0)
A Simplified Java Compilation System for Resource-Constrained Embedded Processors (2007) (0)
Proceedings of the 2016 International Conference on Supercomputing (2009) (0)
Load Balancing with Polygonal Partitions (2018) (0)
Welcome to ICS'02 (2002) (0)
Annotation Integration and Trade-off Analysis for Multimedia Applications (2007) (0)
Off-Chip Memory Access Optimizations (1999) (0)
W Operating System DVS Scheduler Network Management Transcoding Admission Control Applications Video Player Other Tasks Middleware (2003) (0)
Distributed Multimedia Streaming in a Heterogeneous Environment (2003) (0)
Branch optimizations and instruction-level parallelism exploitation for dynamic superscalar and vliw processors (2000) (0)
Case Study: MPEG Decoder (1999) (0)
Microprogramming research projects at Cornell University (1986) (0)
Proceedings of the NATO Advanced Research Workshop on Advanced Environments, Tools, and Applications for Cluster Computing-Revised Papers (2001) (0)
A Spill Code Minimization Algorithm for Loops (2015) (0)
Comparison of Compacting for Garbage Collection (1983) (0)
Memory Resiliency Invited Paper in Special Session Resiliency: Approaches for the Next Decade" (2014) (0)
The effects of predicated execution on architectures supporting dynamic speculation (1998) (0)
cpsoc-codes+isss2014-splsession (2015) (0)
COPPER: COMPILER-CONTROLLED ON-DEMAND APPROACH TO POWER-EFFICIENT COMPUTING (2003) (0)
Fetch Size Adaptation vs. Stream Buffer for Media Benchmarks (2001) (0)
A Systematic Approach to Branch Speculation (1997) (0)
Speculative Execution by Compiler Supported Branch Prediction Hardware (1996) (0)
Proceedings of the 16th international conference on Supercomputing, ICS 2002, New York City, NY, USA, June 22-26, 2002 (2002) (0)
ROPE: A New Twist in Computer Architectures (1987) (0)
Paradigm with Cross-Layer Virtual Sensors and Actuators (2013) (0)
Fine-grain parallelization versus the wavefront method (1989) (0)
Ultra fine-grain template-driven synthesis (1994) (0)
Data Organization: The Processor Core/Cache Interface (1999) (0)
Content annotation for power and quality trade-offs in mobile multimedia systems (2007) (0)
Fault Tolerant Scheduling for Parallel Loops on Shared Memory Systems (2015) (0)
Probablistic Self-Scheduling (2006) (0)
Fine-grain loop scheduling for MIMD machines (1990) (0)
Tutorial 2, parallel processing: architecture and software: the 17th Annual International Symposium on Computer Architecture (1990) (0)

What Schools Are Affiliated With Alexandru Nicolau?

Alexandru Nicolau is affiliated with the following schools: