Alexandru Nicolau
#143,607
Most Influential Person Now
Alexandru Nicolau's AcademicInfluence.com Rankings
Alexandru Nicolau's Computer Science Degrees
- Computer Science: World Rank #7052, Historical Rank #7427
- Computer Architecture: World Rank #43, Historical Rank #45
- Database: World Rank #4112, Historical Rank #4276

Why Is Alexandru Nicolau Influential?
Alexandru Nicolau's Published Works
Published Works
- EXPRESSION: a language for architecture exploration through compiler/simulator retargetability (1999) (439)
- SPARK: a high-level synthesis framework for applying parallelizing compiler transformations (2003) (429)
- Automatic program parallelization (1993) (340)
- Efficient utilization of scratch-pad memory in embedded processor applications (1997) (294)
- Optimal loop parallelization (1988) (286)
- On-chip vs. off-chip memory: the data partitioning problem in embedded processor-based systems (2000) (247)
- Parallelizing Programs with Recursive Data Structures (1989) (233)
- A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors (2009) (209)
- Measuring the Parallelism Available for Very Long Instruction Word Architectures (1984) (186)
- Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration (1998) (179)
- Profile-based dynamic voltage scheduling using program checkpoints (2002) (177)
- Partitioned Register Files For VLIWs: A Preliminary Analysis Of Tradeoffs (1992) (172)
- Perfect Pipelining: A New Loop Parallelization Technique (1988) (155)
- Underdesigned and Opportunistic Computing in Presence of Hardware Variability (2013) (152)
- Memory Issues in Embedded Systems-on-Chip (1999) (152)
- Adapting cache line size to application behavior (1999) (140)
- Percolation based synthesis (1990) (138)
- Percolation Scheduling: A Parallel Compilation Technique (1985) (134)
- Integrated power management for video streaming to mobile handheld devices (2003) (133)
- Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared-Memory Machines (1989) (131)
- Run-Time Disambiguation: Coping with Statically Unpredictable Dependencies (1989) (113)
- Abstractions for recursive pointer data structures: improving the analysis and transformation of imperative programs (1992) (112)
- Augmenting Loop Tiling with Data Alignment for Improved Cache Performance (1999) (108)
- Efficient simulation of large-scale Spiking Neural Networks using CUDA graphics processors (2009) (107)
- A general data dependence test for dynamic, pointer-based data structures (1994) (107)
- Advances in languages and compilers for parallel processing (1991) (106)
- Coordinated parallelizing compiler optimizations and high-level synthesis (2004) (105)
- A global resource-constrained parallelization technique (1989) (97)
- A Development Environment for Horizontal Microcode (1986) (87)
- Uniform Parallelism Exploitation in Ordinary Programs (1985) (82)
- Memory data organization for improved cache performance in embedded processor applications (1997) (80)
- Local memory exploration and optimization in embedded systems (1999) (79)
- Architectural exploration and optimization of local memory in embedded systems (1997) (77)
- Comparison of Compacting Algorithms for Garbage Collection (1983) (75)
- Incremental tree height reduction for high level synthesis (1991) (73)
- Resource-Constrained Software Pipelining (1995) (71)
- DYNAMO: A Cross-Layer Framework for End-to-End QoS and Energy Optimization in Mobile Handheld Devices (2007) (70)
- Data Memory Organization and Optimizations in Application-Specific Systems (2001) (68)
- Using global code motions to improve the quality of results for high-level synthesis (2004) (68)
- Network topology exploration of mesh-based coarse-grain reconfigurable architectures (2004) (66)
- Trailblazing: A Hierarchical Approach to Percolation Scheduling (1993) (63)
- Parallel processing: a smart compiler and a dumb machine (1984) (62)
- A cross-layer approach for power-performance optimization in distributed mobile systems (2005) (59)
- Memory aware compilation through accurate timing extraction (2000) (59)
- Power savings in embedded processors through decode filter cache (2002) (56)
- Memory organization for improved data cache performance in embedded processors (1996) (56)
- R-Kleene: A High-Performance Divide-and-Conquer Algorithm for the All-Pair Shortest Path for Densely Connected Networks (2007) (54)
- Functional abstraction driven design space exploration of heterogeneous programmable architectures (2001) (53)
- On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings (2006) (53)
- V-SAT: a visual specification and analysis tool for system-on-chip exploration (2001) (52)
- Design of a predictive filter cache for energy savings in high performance processor architectures (2001) (51)
- Annotating the Java Bytecodes in Support of Optimization (1997) (51)
- An efficient compiler technique for code size reduction using reduced bit-width ISAs (2002) (50)
- RTGEN: an algorithm for automatic generation of reservation tables from architectural descriptions (1999) (50)
- Java annotation-aware just-in-time (AJIT) compilation system (1999) (50)
- Loop shifting and compaction for the high-level synthesis of designs with complex control flow (2004) (50)
- Bypass aware instruction scheduling for register file power reduction (2006) (49)
- Architecture Description Languages for Systems-on-Chip Design (1999) (44)
- Optimal register assignment to loops for embedded code generation (1996) (43)
- Access pattern based local memory customization for low power embedded systems (2001) (43)
- Parallel processing: a smart compiler and a dumb machine (2004) (43)
- Architectural and compiler strategies for dynamic power management in the COPPER project (2001) (41)
- Exploiting off-chip memory access modes in high-level synthesis (1997) (39)
- EXPRESSION: An ADL for system level design exploration (1998) (38)
- CyberPhysical-System-On-Chip (CPSoC): A self-aware MPSoC paradigm with cross-layer virtual sensing and actuation (2015) (38)
- Loop Quantization: an Analysis and Algorithm (1987) (37)
- Speculation techniques for high level synthesis of control intensive designs (2001) (37)
- The Strict Time Lower Bound and Optimal Schedules for Parallel Prefix with Resource Constraints (1996) (36)
- Power / capacity scaling: Energy savings with simple fault-tolerant caches (2014) (36)
- APEX: access pattern based memory architecture exploration (2001) (35)
- Mutation Scheduling: A Unified Approach to Compiling for Fine-Grain Parallelism (1994) (34)
- Comparative architectural characterization of SPEC CPU2000 and CPU2006 benchmarks on the Intel® Core™ 2 Duo processor (2008) (34)
- Automatic verification of in-order execution in microprocessors with fragmented pipelines and multicycle functional units (2002) (34)
- SmartBalance: A sensing-driven Linux load balancer for energy efficiency of heterogeneous MPSoCs (2015) (34)
- Languages and Compilers for Parallel Computing (1993) (33)
- Data cache sizing for embedded processor applications (1998) (33)
- VaMV: Variability-aware Memory Virtualization (2012) (33)
- Reducing data cache energy consumption via cached load/store queue (2003) (33)
- Loop Quantization: A Generalized Loop Unwinding Technique (1988) (33)
- Tight analysis of the performance potential of thread speculation using SPEC CPU 2006 (2007) (33)
- Register Allocation, Renaming and Their Impact on Fine-Grain Parallelism (1991) (32)
- Loop Quantization or Unwinding Done Right (1987) (32)
- Reducing power consumption for high-associativity data caches in embedded processors (2003) (32)
- Processor-memory co-exploration driven by a Memory-Aware Architecture Description Language (2001) (31)
- Abstract description of pointer data structures: an approach for improving the analysis and optimization of imperative programs (1992) (31)
- Energy efficient watermarking on mobile devices using proxy-based partitioning (2006) (31)
- A language for conveying the aliasing properties of dynamic, pointer-based data structures (1994) (31)
- A Mapping Strategy for MIMD Computers (1993) (31)
- A customizable compiler framework for embedded systems (2001) (30)
- MIST: an algorithm for memory miss traffic management (2000) (28)
- Parallelism, memory anti-aliasing and correctness for trace scheduling compilers (disambiguation, flow-analysis, compaction) (1984) (28)
- Towards parallelizing the layout engine of Firefox (2010) (27)
- Challenges in exploitation of loop parallelism in embedded applications (2006) (27)
- ViPZonE: OS-level memory variability-driven physical address zoning for energy savings (2012) (27)
- Fault tolerance in super-scalar and VLIW processors (1991) (27)
- Exploiting parallelism in matrix-computation kernels for symmetric multiprocessor systems: Matrix-multiplication and matrix-addition algorithm optimizations by software pipelining and threads allocation (2011) (27)
- Memory Architecture Exploration for Programmable Embedded Systems (2002) (26)
- Automatic modeling and validation of pipeline specifications driven by an architecture description language [SoC] (2002) (26)
- WebRTCbench: a benchmark for performance assessment of WebRTC implementations (2015) (25)
- Adaptive Strassen's matrix multiplication (2007) (25)
- OpenCV.js: computer vision processing for the open web platform (2018) (25)
- Optimal schedules for parallel prefix computation with bounded resources (1991) (25)
- Elimination of redundant memory traffic in high-level synthesis (1996) (25)
- On-chip self-awareness using Cyberphysical-Systems-on-Chip (CPSoC) (2014) (25)
- Incorporating DRAM access modes into high-level synthesis (1998) (24)
- Operation tables for scheduling in the presence of incomplete bypassing (2004) (24)
- The Design of the PROMIS Compiler (1999) (24)
- Conditional speculation and its effects on performance and area for high-level synthesis (2001) (23)
- Dynamically increasing the scope of code motions during the high-level synthesis of digital circuits (2003) (23)
- A Unified code generation approach using mutation scheduling (1994) (23)
- Using an oracle to measure potential parallelism in single instruction stream programs (1981) (23)
- Improving cache Performance Through Tiling and Data Alignment (1997) (22)
- Adaptive Winograd's matrix multiplications (2009) (22)
- Multi-layer memory resiliency (2014) (21)
- Dynamic common sub-expression elimination during scheduling in high-level synthesis (2002) (21)
- Abstractions for Recursive Pointer Data Structures: Improving the Analysis of Imperative Programs (1992) (21)
- A data alignment technique for improving cache performance (1997) (20)
- History-aware Self-Scheduling (2006) (20)
- Automatic software toolkit generation for embedded systems-on-chip (1999) (20)
- Interface synthesis using memory mapping for an FPGA platform (2003) (20)
- Integrating Program Transformations In The Memory-based Synthesis Of Image And Video Algorithms (1994) (20)
- FORGE: a framework for optimization of distributed embedded systems software (2003) (20)
- Performance evaluation for application-specific architectures (1995) (20)
- Interference analysis tools for parallelizing programs with recursive data structures (1989) (19)
- PBExplore: a framework for compiler-in-the-loop exploration of partial bypassing in embedded processors (2005) (18)
- Advanced Environments, Tools, and Applications for Cluster Computing (2002) (17)
- Compilation framework for code size reduction using reduced bit-width ISAs (rISAs) (2006) (17)
- Fractal Matrix Multiplication: A Case Study on Portability of Cache Performance (2001) (17)
- CAMFAS: A Compiler Approach to Mitigate Fault Attacks via Enhanced SIMDization (2017) (17)
- Achieving Multi-level Parallelization (1997) (16)
- Static Scheduling for Dynamic Dataflow Machines (1990) (16)
- Resource Directed Loop Pipelining: Exposing Just Enough Parallelism (1997) (16)
- A Simple Mechanism for Improving the Accuracy and Efficiency of Instruction-Level Disambiguation (1995) (16)
- PBPAIR: an energy-efficient error-resilient encoding using probability based power aware intra refresh (2006) (16)
- High-Level synthesis with Synchronous and RAMBUS DRAMs (1998) (15)
- Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing (1991) (15)
- Interconnect-Aware Mapping of Applications to Coarse-Grain Reconfigurable Architectures (2004) (15)
- Memory system connectivity exploration (2002) (15)
- An empirical study of the effect of source-level loop transformations on compiler stability (2018) (15)
- Coordinated transformations for high-level synthesis of high performance microprocessor blocks (2002) (15)
- SIMD-based soft error detection (2016) (15)
- Realistic scheduling: compaction for pipelined architectures (1990) (15)
- Register File Power Reduction Using Bypass Sensitive Compiler (2008) (15)
- Aggregating processor free time for energy reduction (2005) (15)
- Using Recursion to Boost ATLAS's Performance (2005) (14)
- NSF expedition on variability-aware software: Recent results and contributions (2015) (14)
- Design considerations for limited connectivity VLIW architectures (1992) (14)
- A Geometric Approach for Partitioning N-Dimensional Non-rectangular Iteration Spaces (2004) (14)
- Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems (2005) (14)
- Minimization of Memory Traffic in High-Level Synthesis (1994) (14)
- Techniques for efficient placement of synchronization primitives (2009) (14)
- Synchronization optimizations for efficient execution on multi-cores (2009) (13)
- Using profiling to reduce branch misprediction costs on a dynamically scheduled processor (2000) (13)
- Incorporating compiler feedback into the design of ASIPs (1995) (12)
- Access pattern-based memory and connectivity architecture exploration (2003) (12)
- Acceleration Framework for FPGA Implementation of OpenVX Graph Pipelines (2018) (12)
- A Fine-Grain Parallelizing Compiler (1986) (12)
- Dynamic conditional branch balancing during the high-level synthesis of control-intensive designs (2003) (12)
- VISTA: The Visual Interface for Scheduling Transformations and Analysis (1993) (12)
- The PROMIS compiler prototype (1997) (11)
- New directions in compiler technology for embedded systems (2001) (11)
- Architecture description language driven design space exploration in the presence of coprocessors (2001) (11)
- High performance annotation-aware JVM for Java cards (2005) (11)
- A design space exploration framework for reduced bit-width Instruction Set architecture (rISA) design (2002) (11)
- A performance evaluator for parameterized ASIC architectures (1994) (11)
- AVid: Annotation driven video decoding for hybrid memories (2012) (11)
- Efficient hardware for multiway jumps and pre-fetches (1985) (11)
- DPCS: Dynamic Power/Capacity Scaling for SRAM Caches in the Nanoscale Era (2015) (11)
- Variability-aware memory management for nanoscale computing (2013) (11)
- An Efficient Load Balancing Scheme for Grid-based High Performance Scientific Computing (2005) (11)
- On the exploitation of loop-level parallelism in embedded applications (2009) (11)
- On the Determination of Inlining Vectors for Program Optimization (2013) (10)
- Compiler-Directed Cache Line Size Adaptivity (2000) (10)
- Lightweight lock-free synchronization methods for multithreading (2006) (10)
- Automatic validation of pipeline specifications (2001) (10)
- Software Annotations for Power Optimization on Mobile Devices (2006) (10)
- Loop Quantization: Unwinding for Fine-Grain Parallelism Exploitation (1985) (10)
- Annotation Based Multimedia Streaming Over Wireless Networks (2006) (10)
- LORE: A loop repository for the evaluation of compilers (2017) (10)
- Computing Programs Containing Band Linear Recurrences on Vector Supercomputers (1996) (10)
- Pruning hardware evaluation space via correlation-driven application similarity analysis (2011) (10)
- Expression equivalence checking using interval analysis (2006) (10)
- OpenCV.js: Computer Vision Processing for the Web (2017) (10)
- An annotation-aware Java virtual machine implementation (2000) (9)
- Equivalence checking of arithmetic expressions using fast evaluation (2005) (9)
- Retargetable pipeline hazard detection for partially bypassed processors (2006) (9)
- Speedup of band linear recurrences in the presence of resource constraints (1992) (9)
- A fault tolerant self-scheduling scheme for parallel loops on shared memory systems (2012) (9)
- AFFIX: Automatic Acceleration Framework for FPGA Implementation of OpenVX Vision Algorithms (2019) (9)
- A Framework for Data Dependence Testing in the Presence of Pointers (1994) (9)
- Automatic Design Space Exploration of Register Bypasses in Embedded Processors (2007) (9)
- A Percolation Based VLIW Architecture (1991) (9)
- Optimal register assignment to loops for embedded code generation (1995) (9)
- Teaching Parallel Computing and Dependence Analysis with Python (2019) (8)
- Reducing power with an L0 instruction cache using history-based prediction (2002) (8)
- Concurrent Information Processing and Computing (2005) (8)
- Efficient Scheduling of Nested Parallel Loops on Multi-Core Systems (2009) (8)
- A simplified java bytecode compilation system for resource-constrained embedded processors (2007) (8)
- Data-rate-aware FPGA-based acceleration framework for streaming applications (2016) (8)
- An Efficient Global Resource Constrained Technique for Exploiting Instruction Level Parallelism (1992) (8)
- A novel approach for partitioning iteration spaces with variable densities (2005) (8)
- Exploitation of nested thread-level speculative parallelism on multi-core systems (2010) (8)
- Large-scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU) (2015) (8)
- Software fault tolerance for FPUs via vectorization (2015) (7)
- Automatic generation of operation tables for fast exploration of bypasses in embedded processors (2006) (7)
- Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption (2009) (7)
- Accelerating Brain Circuit Simulations of Object Recognition with a Sony PlayStation 3 (2007) (7)
- Proceedings of the 16th international conference on Supercomputing (2002) (7)
- ViPZonE: Hardware Power Variability-Aware Virtual Memory Management for Energy Savings (2015) (7)
- A development environment for scientific parallel programs (1986) (7)
- Managing Cross-Layer Constraints for Interactive Mobile Multimedia (2003) (7)
- An environment for the development of microcode for pipelined architectures (1992) (6)
- Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming (2013) (6)
- An efficient, global resource-directed approach to exploiting instruction-level parallelism (1996) (6)
- Compiler-Directed Cache Assist Adaptivity (2000) (6)
- Caching values in the load store queue (2004) (6)
- Energy analysis of multimedia watermarking on mobile handheld devices (2005) (6)
- Optimizing Program Performance via Similarity, Using a Feature-Agnostic Approach (2013) (6)
- A general approach for partitioning N-dimensional parallel nested loops with conditionals (2006) (6)
- Resource-Directed Loop Pipelining (1996) (6)
- Parallel Language and Compiler Research in Japan (1995) (6)
- Short-Circuit Compiler Transformation: Optimizing Conditional Blocks (2007) (6)
- High-Level Synthesis with SDRAMs and RAMBUS DRAMs (Special Section on VLSI Design and CAD Algorithms) (1999) (6)
- Simultaneous Way footprint Prediction and Branch Prediction for Energy Savings in Set associative Instruction Caches (2001) (6)
- An Approach to Combine Predicated/Speculative Execution for Programs with Unpredictable Branches (1994) (6)
- An Efficient Approach for Self-scheduling Parallel Loops on Multiprogrammed Parallel Computers (2005) (6)
- Percolation scheduling for non-VLIW machines (1990) (6)
- A method for register allocation to loops in multiple register file architectures (1996) (6)
- New directions in compiler technology for embedded systems (embedded tutorial) (2001) (6)
- Memory Subsystem Description in EXPRESSION (2000) (6)
- Selected papers of the second workshop on Languages and compilers for parallel computing (1990) (6)
- Harmonic scheduling of linear recurrences for digital filter design (1992) (6)
- Control flow optimization in loops using interval analysis (2008) (5)
- A hypergraph-based model for port allocation on multiple-register-file VLIW architectures (1995) (5)
- Applying an Abstract Data Structure Description Approach to Parallelizing Scientific Pointer Programs (1992) (5)
- Using Hardware Counters to Predict Vectorization (2017) (5)
- Impact of JVM superoperators on energy consumption in resource-constrained embedded systems (2008) (5)
- Customizing Software Toolkits for Embedded Systems-On-Chip (2000) (5)
- Using a Way Cache to Improve Performance of Set-Associative Caches (2005) (5)
- Proxy-based task partitioning of watermarking algorithms for reducing energy consumption in mobile devices (2004) (5)
- Novel Brain-Derived Algorithms Scale Linearly with Number of Processing Elements (2007) (5)
- Brain Derived Vision Algorithm on High Performance Architectures (2009) (5)
- Cache-aware partitioning of multi-dimensional iteration spaces (2009) (5)
- Cache-aware iteration space partitioning (2008) (5)
- Enhanced Loop Coalescing: A Compiler Technique for Transforming Non-uniform Iteration Spaces (2005) (5)
- Content-aware Power Optimizations for Multimedia Streaming Over Wireless Networks (2006) (5)
- Annotation-aware dynamic compilation and interpretation (2002) (5)
- How Many Threads to Spawn during Program Multithreading? (2010) (4)
- An annotation‐aware Java virtual machine implementation (2000) (4)
- Instruction Level Parallelism (2016) (4)
- Minimization of Memory Traffic in High-Level Synthesis (1995) (4)
- Improving numerical accuracy for non-negative matrix multiplication on GPUs using recursive algorithms (2013) (4)
- A predictive decode filter cache for reducing power consumption in embedded processors (2007) (4)
- Using Data Dependence Analysis and Loop Transformations to Teach Vectorization (2017) (4)
- The Design of the PROMIS Compiler—Towards Multi-Level Parallelization (2000) (4)
- Video Stream Annotations for Energy Trade-offs in Multimedia Applications (2006) (4)
- Parallelizing tightly nested loops (1991) (4)
- High-Level Synthesis of Scalable Architectures for IIR Filters Using Multichip Modules (1993) (4)
- Adaptive Line Size Cache (1999) (4)
- Just in Time Load Balancing (2012) (4)
- Miss Elimination by Time stride Prefetch (2000) (4)
- Improving the Accuracy of High Performance BLAS Implementations Using Adaptive Blocked Algorithms (2011) (4)
- Specification of Hazards, Stalls, Interrupts, and Exceptions in EXPRESSION (2001) (4)
- Improving accuracy for matrix multiplications on GPUs (2011) (4)
- Research Directions in Compiling For Massive Parallelism (1992) (3)
- Partitioning of Variables for Multiple-Register-File Architectures via Hypergraph Coloring (1994) (3)
- On the efficacy of call graph-level thread-level speculation (2010) (3)
- ServiceFORGE: A Software Architecture for Power and Quality Aware Services (2003) (3)
- Symbolic Analysis in the PROMIS Compiler (1999) (3)
- Using Annotations to Facilitate Power vs Quality Trade-offs in Streaming Applications (2005) (3)
- Partitioning of Variables for Multiple-Register-File VLIW Architectures (1994) (3)
- Parallelizing Non-Vectorizable Loops for MIMD Machines (1990) (3)
- A radiative transfer module for calculating photolysis rates and solar heating in climate models: Solar-J v7.5 (2017) (3)
- Exploring scalable schedules for IIR filters with resource constraints (1999) (3)
- A development environment for horizontal microcode programs (1986) (3)
- A Data Cache with Dynamic Mapping (2003) (3)
- Data dependence testing in the presence of pointers and pointer-based data structures (1998) (3)
- A Hierarchical Parallelizing Compiler for VLIW/MIMD Machines (1992) (3)
- DPCS (2015) (2)
- EdgeAvatar: An Edge Computing System for Building Virtual Beings (2021) (2)
- Fine grain software pipelining of non-vectorizable nested loops (1991) (2)
- A compiler-driven supercomputer (1986) (2)
- Cache with Adaptive Fetch Size (2000) (2)
- N-dimensional perfect pipelining (1992) (2)
- Automatic program parallelization: Languages and compilers (1993) (2)
- Towards an Achievable Performance for the Loop Nests (2018) (2)
- Getting High Performance with Slow Memory (1986) (2)
- Optimizing control flow in loops using interval and dependence analysis (2009) (2)
- Languages and Compilers for Parallel Computing: 7th International Workshop, Ithaca, NY, USA, August 8 - 10, 1994. Proceedings (1995) (2)
- Analyzing the individual/combined effects of speculative and guarded execution on a superscalar architecture (1998) (2)
- Scalable techniques for computing band linear recurrences on massively parallel and vector supercomputers (1994) (2)
- Line size adaptivity analysis of parameterized loop nests for direct mapped data cache (2005) (2)
- Percolation scheduling with resource constraints (1989) (2)
- Languages and Compilers for Parallel Computing: 5th International Workshop, New Haven, Connecticut, Usa, August 3-5, 1992 : Proceedings (1993) (2)
- Aggressive Memory-Aware Compilation (2000) (2)
- Fine-grain compilation for pipelined machines (1988) (2)
- Probability based power aware error resilient coding (2005) (2)
- Comparative characterization of SPEC CPU2000 and CPU2006 on Itanium® architecture (2007) (2)
- Accelerating Brain Circuit Simulations of Object Recognition with CELL Processors (2007) (2)
- Polygonal Iteration Space Partitioning (2016) (2)
- A hierarchical approach to instruction-level parallelization (1995) (2)
- Microflow: A Fine-Grain Parallel Processing Approach (1985) (2)
- Overview of ILP Architectures (2016) (2)
- EXPRESSION User Manual version 1.0 (2003) (1)
- Proceedings of the 23rd international conference on Supercomputing (2009) (1)
- Harmonic Scheduling: A Technique for Scheduling beyond Loop-Carried Dependencies (1993) (1)
- Software Pipelining by Kernel Recognition (2016) (1)
- Pretty Good Accuracy in Matrix Multiplication with GPUs (2010) (1)
- Selective search of inlining vectors for program optimization (2012) (1)
- A Model-Based Approach to System Specification for Distributed Real-time and Embedded Systems (2004) (1)
- Dynamically adaptive fetch size prediction for data caches (2003) (1)
- Detecting COVID-19 Related Pneumonia On CT Scans Using Hyperdimensional Computing (2021) (1)
- Languages and Compilers for Parallel Computing: 6th International Workshop, Portland, Oregon, USA, August 12 - 14, 1993. Proceedings (1994) (1)
- Languages and Compilers for Parallel Computing (2001) (1)
- NumbaSummarizer: A Python Library for Simplified Vectorization Reports (2020) (1)
- A Case for an Adaptive and Opportunistic Variability-Aware Memory Virtualization Layer (2011) (1)
- A global resource-constrained parallelization technique (1989) (1)
- JuliusC: A Practical Approach for the Analysis of Divide-and-Conquer Algorithms (2004) (1)
- On-Chip vs. Off-Chip Memory: Utilizing Scratch-Pad Memory (1999) (1)
- Modulo Scheduling and Loop Pipelining (2011) (1)
- On the evaluation and extraction of thread-level parallelism in ordinary programs (2008) (1)
- MCompiler: A Synergistic Compilation Framework (2019) (1)
- PBPAIR: Probability Based Power Aware Intra Refresh: A New Energy-efficient Error-resilient Encoding Scheme (2005) (1)
- Parallelization of programs containing loop-carried dependences with resource constraints (1994) (1)
- Power-Aware Multimedia Streaming in Heterogeneous Multi-User Environments (2003) (1)
- Effective Evaluation of Multi-core Based Systems (2013) (1)
- Pruning Hardware Evaluation Space via Causality-Driven Application Similarity Analysis (2010) (1)
- Path Collection and Dependence Testing in the Presence of Dynamic, Pointer-Based Data Structures (1995) (1)
- Compiler-in-the-Loop ADL-driven Early Architectural Exploration (2005) (1)
- AFFIX (2019) (1)
- CECS TR Coversheet New (2015) (0)
- A Framework for GUI-driven Design Space Exploration of a MIPS 4K-like processor (2003) (0)
- Low-level programming for a massively parallel fine-grain computer: the Microflow approach (1987) (0)
- The x-legion: a compiler-approach to exploit locality and portability of divide-and-conquer algorithms (2005) (0)
- Falcon: a Matlab Interactive Restructuring Compiler (1995) (0)
- From the guest editors (2007) (0)
- High-Level Synthesis of Scalable Architectures for IIR Filters Using Parameterized MCM's (1992) (0)
- Copy Elimination for Parallelizing Compilers (1998) (0)
- A new technique for induction variable removal (1991) (0)
- Algorithms of All Pair Shortest Path Problem (2016) (0)
- Register allocation issues in embedded code generation (1998) (0)
- Editors' introduction (2007) (0)
- New Opportunities for Compilers in Computer Security (2018) (0)
- Performance Characterization of Itanium® 2-Based Montecito Processor (2009) (0)
- Concurrent Information Processing and Computing (NATO Science) (2005) (0)
- Architecture exploration of parameterizable EPIC SOS architectures (poster paper) (2000) (0)
- Selective Guarded Execution Using Profiling on a Dynamically Scheduled Processor (2007) (0)
- Compile time vs. runtime: scheduling parallelism on dataflow machines (1989) (0)
- Hierarchical Parallelism Exploitation (2015) (0)
- A Heterogeneous Solution to the All-pairs Shortest Path Problem using FPGAs (2022) (0)
- Editor's Announcement (1998) (0)
- Memory Architecture Exploration (1999) (0)
- Is computer science dying? (2016) (0)
- Regular schedules for scalable design of IIR filters (1993) (0)
- Software pipelining of non-vectorizable loosely nested loops (1991) (0)
- Fault Tolerance for FPUs via Vectorization (2015) (0)
- Massive Parallelism and Fine-Grain Parallelism: are They Incompatible? (1993) (0)
- Scheduling Basic Blocks (2016) (0)
- Disambiguation, Correctness and Flow-Analysis Issues for Trace Scheduling Compilers (1984) (0)
- Session details: Optimizing parallel applications (2009) (0)
- Optimizing Program Performance via Similarity, Using Feature-aware and Feature-agnostic Characterization Approaches (2013) (0)
- Leveraging profile-selected execution patterns for optimized code execution in resource-constrained systems (2010) (0)
- Selective guarded execution using profiling on a dynamically scheduled processor (1999) (0)
- Acknowledgment to Reviewers (1978) (0)
- A Unified Power Management Framework for Distributed Video Streaming to Mobile Devices (2004) (0)
- Comparative characterization of SPEC CPU2000 and CPU2006 on Itanium architecture (2007) (0)
- MIMD Lattice Computation (2011) (0)
- Author retrospective for a global resource-constrained parallelization technique (2014) (0)
- How Do We Make Parallel Processing a Reality? Bridging the Gap Between Theory and Practice (1991) (0)
- PBExplore: A Compiler-in-the-Loop Framework for Design Space Exploration of Partially Bypassed Processor Pipelines, User Manual Version 1.0, 01/15/2008 (2008) (0)
- Harmonic Scheduling of Linear Recurrences in Digital Filter Design (2015) (0)
- Architecture exploration of parameterizable EPIC SOC architectures (2000) (0)
- Incremental tree height reduction for code compaction (1990) (0)
- Multi-Layer Memory Resiliency (Invited Paper in Special Session "Embedded Resiliency: Approaches for the Next Decade") (2014) (0)
- If software is king for systems-on-silicon, what's new in compilers? (1997) (0)
- Hierarchical parallelism exploitation (1989) (0)
- Low Energy Associative Data Caches for Embedded Systems (2003) (0)
- A Compilation and Run-Time Framework for Maximizing Performance of Self-scheduling Algorithms (2014) (0)
- Operation Tables for Scheduling in the Presence of Incomplete Bypassing (2004) (0)
- A Simplified Java Compilation System for Resource-Constrained Embedded Processors (2007) (0)
- Proceedings of the 2016 International Conference on Supercomputing (2009) (0)
- Load Balancing with Polygonal Partitions (2018) (0)
- Welcome to ICS'02 (2002) (0)
- Annotation Integration and Trade-off Analysis for Multimedia Applications (2007) (0)
- Off-Chip Memory Access Optimizations (1999) (0)
- Distributed Multimedia Streaming in a Heterogeneous Environment (2003) (0)
- Branch optimizations and instruction-level parallelism exploitation for dynamic superscalar and VLIW processors (2000) (0)
- Case Study: MPEG Decoder (1999) (0)
- Microprogramming research projects at Cornell University (1986) (0)
- Proceedings of the NATO Advanced Research Workshop on Advanced Environments, Tools, and Applications for Cluster Computing-Revised Papers (2001) (0)
- A Spill Code Minimization Algorithm for Loops (2015) (0)
- Comparison of Compacting Algorithms for Garbage Collection (1983) (0)
- Memory Resiliency (Invited Paper in Special Session "Resiliency: Approaches for the Next Decade") (2014) (0)
- The effects of predicated execution on architectures supporting dynamic speculation (1998) (0)
- cpsoc-codes+isss2014-splsession (2015) (0)
- COPPER: COMPILER-CONTROLLED ON-DEMAND APPROACH TO POWER-EFFICIENT COMPUTING (2003) (0)
- Fetch Size Adaptation vs. Stream Buffer for Media Benchmarks (2001) (0)
- A Systematic Approach to Branch Speculation (1997) (0)
- Speculative Execution by Compiler Supported Branch Prediction Hardware (1996) (0)
- Proceedings of the 16th international conference on Supercomputing, ICS 2002, New York City, NY, USA, June 22-26, 2002 (2002) (0)
- ROPE: A New Twist in Computer Architectures (1987) (0)
- Paradigm with Cross-Layer Virtual Sensors and Actuators (2013) (0)
- Fine-grain parallelization versus the wavefront method (1989) (0)
- Ultra fine-grain template-driven synthesis (1994) (0)
- Data Organization: The Processor Core/Cache Interface (1999) (0)
- Content annotation for power and quality trade-offs in mobile multimedia systems (2007) (0)
- Fault Tolerant Scheduling for Parallel Loops on Shared Memory Systems (2015) (0)
- Probablistic Self-Scheduling (2006) (0)
- Fine-grain loop scheduling for MIMD machines (1990) (0)
- Tutorial 2, parallel processing: architecture and software: the 17th Annual International Symposium on Computer Architecture (1990) (0)
What Schools Are Affiliated With Alexandru Nicolau?
Alexandru Nicolau is affiliated with the following schools: